EDITORIAL


Innovate beyond PFAS 


New proposed legislation on “forever” chemicals
is under consideration in Europe and the United 
States, where per- and polyfluoroalkyl substances 
(PFAS) are a hot topic for regulators and lawmak- 
ers. On both sides of the Atlantic, regulation of 
widely used PFAS has been complex and evolv- 
ing. Their presence in hundreds of different prod- 
ucts—from nonstick cookware to food packaging to fire- 
fighting foam—and their persistence in food, drinking 
water, and the environment have resulted in a pollution 
problem of unprecedented scale. Recently, for example, 
it was reported that 45% of the tap water in the United 
States contains at least one type of PFAS. Because these 
compounds are so chemically stable that they do not de- 
grade in the environment (including in the human body), 
PFAS seriously challenge long-es- 
tablished ideas of how chemicals 
can be used, assessed, and regu- 
lated, and it remains to be seen 
whether the new regulations will 
solve this problem. 

Chemicals assessment tradi- 
tionally has been centered around 
toxicity and physical hazards such 
as flammability. Chemicals that 
are carcinogenic, mutagenic, or 
toxic for reproduction (so-called 
CMR chemicals), as well as chemi- 
cals with high acute toxicity such 
as many neurotoxicants, stand 
out as particularly hazardous sub- 
stances that should be avoided 
by all means. Chemicals of intermediate toxicity (includ- 
ing many PFAS), by contrast, have not been seen as an 
outstanding concern. However, this view turns out to 
be deceptive and dangerous if such chemicals are very 
persistent, as is the case for PFAS. Persistence has been 
seen as a property that merely indicates the presence 
of a chemical in a given environment. It may seem that 
persistent chemicals are inert and thus relatively benign. 
However, they still have many ways to interfere with an 
organism’s physiology and cause adverse effects. There- 
fore, persistence is a property that makes the toxicity of 
any chemical much worse because it leads to—as long 
as uses and corresponding emissions are ongoing—ever- 
increasing concentrations, and toxic effects will manifest 
at some point. In other words, persistence acts as a mul- 
tiplier of toxicity. This insidious aspect of persistence has 
been underestimated in chemicals assessment for a long 
time, and now in the case of PFAS, it has hit home. 


“...persistence acts as a multiplier of toxicity. This insidious aspect...has been underestimated...”


The implications are substantial. One aspect is that 
chemicals that are only moderately toxic, but highly 
persistent, cannot be used in open and dispersive ap- 
plications as has been the case for PFAS, but have to 
be used in closed systems, such as industrial equip- 
ment without any leaks or vents (which is required for 
highly toxic chemicals). Another aspect is that persis- 
tence does not carry sufficient weight in the assess- 
ment and regulation of chemicals. Persistence should 
be seen as a direct element of chemical hazard. The 
current approach of treating persistence only as a 
factor that modulates exposure to a chemical is not 
adequate. Under this approach, low persistence leads 
to lower estimated exposure and, thereby, a rating of 
lower risk in current chemicals assessment, whereas 
high persistence does not lead to 
a “red flag.” 

Accordingly, the way forward 
should include changes to the 
established system of chemicals 
assessment and regulation that 
go beyond the case of PFAS. For 
the specific problem of PFAS, 
it will be necessary to develop 
PFAS-free alternatives for many 
of the current PFAS uses. In 
general, this is possible for the 
vast majority of cases. Even for 
challenging and demanding uses 
such as fire-fighting foams for 
jet-fuel fires, it has been possible 
to develop fluorine-free alterna- 
tives. Research is also underway, for example, in the 
area of battery development, and PFAS-free options 
are already available. 

However, we may be suffering from a “lock-in” of 
PFAS use in many applications as a result of their high 
performance and versatility. This has made them convenient
as a choice in materials development and as
components in industrial processes. Yet, it is clear that 
alternatives to PFAS can be found, and the lock-in of 
PFAS actually may be a roadblock to innovation. Inno- 
vation beyond PFAS should be a call to arms to chem- 
ists, material scientists, product designers, and process 
engineers, but also downstream users of chemicals in 
many sectors who have to define product requirements. 
Alternatives are technically feasible in a wide range of 
cases and offer a pathway toward a more sustainable 
chemistry and a safer world. 

—Martin Scheringer 
Montana State University geographer Cascade Tuholske, in The Washington Post,
as records for high temperatures were approached or exceeded in China, California, and elsewhere. 


Edited by Jeffrey Brainard 


A man with tuberculosis undergoes an electrocardiogram at an Indian clinic that treats drug-resistant TB.


Drugmaker expands access to TB drug 


The pharmaceutical company Johnson & Johnson (J&J) last week
agreed to help make a therapy critical to fighting drug-resistant
tuberculosis (TB) more widely available and affordable. J&J
said it would allow competitors to market generic versions of
the lifesaving drug, bedaquiline, in 44 low- and middle-income
countries (LMICs) where the company has active patents. The
United Nations-affiliated Stop TB Partnership’s Global Drug Facility, 
which announced the deal with J&J, now sells the drug in LMICs for 
$45 per month for a 6-month course; generic versions are estimated 
to cost $8 to $17 per month. The announcement came 2 days after 
bestselling fiction author John Green assailed the company on social 
media, claiming it made minor modifications to the drug to extend 
its patent protection in the 44 LMICs. J&J called that “false.” India’s 
patent office denied J&J a patent extension there in March. The expi- 
ration of a key J&J patent this week is expected to make generic ver- 
sions available in dozens of additional LMICs. 
Panel wants to cut NIH budget


FUNDING | Congressional spending
committees delivered sobering
messages last week to the two biggest
U.S. federal research agencies. A panel
in the Republican-controlled House of
Representatives wants to reduce the current
$47.4 billion budget of the National
Institutes of Health (NIH) by $2.8 billion,
or 6%, for the 2024 fiscal year, which
starts in October. The National Science
Foundation (NSF) would fare better under
proposals from House and Senate committees
but would still receive less than its
current budget of $9.88 billion and almost
$2 billion less than its request for 2024. The
NIH spending bill also blocks research 
using fetal tissue from elective abortions, 
funding for labs in China, and support for 
the EcoHealth Alliance, a nonprofit that has 
taken heat for collaborating with Wuhan 
virologists whom some blame for the 
COVID-19 pandemic. 


Europe lawmakers OK nature bill 


POLICY | The European Parliament last
week narrowly approved a law to restore 
degraded ecosystems, which had drawn 
support from scientists and opposition 
from some farmers’ groups and the body’s 
largest political bloc. The vote was 336 for 
and 300 against, with 13 abstentions. The 
Nature Restoration Law directs countries 
to establish recovery measures for 20%
of the European Union’s land and sea
areas by 2030 and all ecosystems in need
of restoration by 2050. The European
Commission says more than 80% of
habitats are in “bad or poor” conservation
status. The European Parliament will now 
negotiate with the Commission and the 
Council of the EU, which represents mem- 
ber states, to finalize the rules. 


Gene blocks COVID-19 symptoms 


PUBLIC HEALTH | About 20% of people 
infected by SARS-CoV-2 don’t develop 
symptoms, and researchers have discovered 
one reason why: They are more than twice 
as likely as people who get sick to carry a 
particular version of HLA-B, a key immune 
system gene. HLA-B codes for a cell-surface 
protein that alerts the immune system to
invading viruses. The protective HLA-B
variant may allow immune cells to purge
infected cells rapidly and eradicate the virus
before a person becomes ill, the scientists
report this week in Nature. When they tested
blood samples collected from people with this
variant before the COVID-19 pandemic, they
determined that 75% carried immune system
T cells already primed to attack SARS-CoV-2,
likely because of exposure to common
coronaviruses, relatives of SARS-CoV-2 that
cause colds. The HLA-B variant doesn’t
explain all asymptomatic COVID-19 infections;
only 20% of people in the study who
remained symptom-free carried it. But the
result could help researchers refine COVID-19
vaccines and develop new treatments.


A genetic mutation that has
enabled insects to eat toxin-laden
plants also permits ladybird
beetles to prey on aphids that
secrete toxins.


Ancient genetic trick enables insects to resist toxins


More than 20 years ago, scientists pinpointed a mutation
that helps wild fruit flies and other insects resist commercial
insecticides. It's not a new trick, researchers reported
this week in Nature Ecology & Evolution: Over the past
300 million years, moths, butterflies, mealy bugs, and
aphids have employed the same genetic adaptation to sidestep
the deadly natural chemical defenses deployed by plants. The
genetic changes occurred in a single gene that codes for a protein
called the GABA receptor, which acts as a landing site for a key
nerve signaling compound called gamma-aminobutyric acid.
Insecticides and many natural toxins act by binding to the same
receptor. It was a classic arms race: Plants evolved toxins and,
freed from insect predation, diversified. Then, genetic mutations
and gene duplications altered the GABA receptor, enabling
insects to eat these plants with impunity; fed by this broader
menu, the insects, too, diversified, the researchers suggest.


Women instructors disclose stigma


WORKPLACE | Demonstrating an openness
that could benefit undergraduates, women
college instructors in science and engineering
are more likely than men to share with
their students personal details potentially
carrying stigma, a study has found. A team
surveyed more than 2000 faculty members
and instructors at U.S. research-intensive
institutions about concealable, sensitive
aspects of their identity. These included
sexual orientation, growing up with low
income and being a first-generation college
student, and others. After the study team
controlled for race, age, and seniority, they
found that women survey respondents
reported higher rates of depression, anxiety,
and other concealable disabilities than men
and perceived more stigma associated with
all the concealable identities listed in the
survey. But the women were also 1.46 times
more likely to reveal them to students. The
authors of the study, published this week
in PLOS ONE, suggest that instructors who
disclose these personal details can serve as
role models for inclusivity in science.


A hard road for nonnative English


WORKPLACE | The predominance of
English in scientific communications
makes it difficult for many nonnative
speakers around the world to complete
routine professional tasks, new research
shows. The effects are more pronounced
in low-income countries. A research
team surveyed 908 environmental
scientists from eight nationalities and
found that nonnative English speakers
reported taking up to twice as long as
native English speakers to read a paper
in that language. Manuscripts submitted
to journals by nonnative speakers
were 2.6 times as likely to be rejected.
Authors were also 12 times more likely
to be asked for revisions because of
the quality of the English writing, the
team reports this week in PLOS Biology.
Almost half of nonnative speakers
decided not to attend conferences conducted
in English, and of those who did
go, about one-third avoided giving an
oral presentation. The authors call for
remedies such as wider use and acceptance
by journals of automated language
translators. (A Science interview with
two co-authors of the study is at
https://scim.ag/nonnativeEnglish.)
Polymetallic nodules, pictured here dotting the seabed under a carnivorous sponge, contain valuable metals that are in demand by the electric vehicle industry.

Prospect of unregulated deep-sea mining looms


Seabed authority fails to finish rules in time, opening the door to the first mining license 


By Erik Stokstad 


A high-stakes effort to hash out environmental
regulations governing
mining of the seabed in international
waters ended without agreement last
week, after negotiators missed a July
deadline. The lack of progress by the
International Seabed Authority (ISA) raises 
an alarming prospect, observers say: that 
mining operations targeting metals needed 
by the electric vehicle industry could 
commence without effective regulations 
in place. 

“The ISA has just passed into uncharted 
territory,” says Matthew Gianni, a policy
adviser for the Deep Sea Conservation Co- 
alition who attended the meeting in Kings- 
ton, Jamaica. It’s a situation that could 
put the health of deep-sea ecosystems at 
risk—especially because so much remains 
unknown. “It will take a long time to really 
understand the biodiversity,” says Patricia
Esquete Garrote, a marine ecologist and
taxonomist with the University of Aveiro.
Much of the discussion has centered 
on a remote region of the eastern Pacific 
Ocean called the Clarion-Clipperton Zone 
(CCZ), which stretches from south of Ha- 
waii to Mexico. Four kilometers down, the 
sea floor in the region is dotted with trillions
of polymetallic nodules that are estimated
to contain more cobalt and nickel
than all known land deposits. Companies
hope to mine the nodules by sucking them
into bus-size seafloor mining machines
and pumping them to the surface. But
many living things have an earlier claim on
the ancient nodules: They serve as habitat
for sponges, polychaete worms, corals, and
other deep-sea creatures.


“It will take a long time to
really understand the biodiversity.”

Patricia Esquete Garrote,
University of Aveiro

Scientists say mining operations could 
cause irreversible damage to the habitat 
because the nodules can take millions 
of years to form via precipitation of dissolved
minerals. Any creatures living
on them would be killed. The operation
would also stir up sediment plumes, laced 
with metals, and generate noise and light 
that could spread harm far beyond the 
mining site. 

Managing these risks is the job of ISA, 
which regulates seabed mining beyond the 
limits of national jurisdiction. The inter- 
governmental body, which was established 
under a 1994 agreement by the United 
Nations, spent years developing rules for 
commercial exploration and has issued 
31 permits to mining companies and govern- 
ment agencies. One of those companies— 
Canadian-owned Nauru Ocean Resources— 
hopes to start mining operations in the CCZ 
in late 2024. 

Those plans jump-started the current 
flurry of diplomatic activity. In 2021, the
tiny island nation of Nauru, which serves 
as the company’s sponsor in ISA proceed- 
ings, invoked an obscure provision of in- 
ternational law, which holds that within 
2 years of being notified of intent to apply 
for permission to mine in the deep sea, ISA 
must complete its mining regulations. If not, 
it must “provisionally” approve a commer- 
cial permit despite the absence of final regu- 
lations. The draft regulations for full-scale 
mining had been in the works since 2011, 
but the pressure to finalize them intensified 
after Nauru triggered the rule and forced a 
9 July deadline. 

In March, negotiators failed to make 
much progress on the 
more than 60 pages 
of draft environmen- 
tal regulations. Last 
week, Gianni says, 
consensus emerged 
around a few issues, 
such as the need for 
regional environment 
management plans 
and environmental
standards—rules that 
would set acceptable 
amounts of noise, 
for example—before 
mining contracts are 
issued. The meeting 
won’t wrap up until 28 July, but the ses- 
sions focused on environmental regula- 
tions have ended and observers say ISA is 
unlikely to release regulations this month. 
One key question that remains, Gianni says, 
is whether any damage to deep-sea species 
and ecosystems would be permitted and, if 
so, how much it could be mitigated or offset. 

That’s a difficult issue to settle, scientists 
say, because so much remains unknown 
about deep-sea ecosystems and how min- 
ing operations might harm them. In May, 
for example, a paper found that thousands 
of species have yet to be identified in the 
CCZ. After examining some 100,000 records 
of specimens collected or observed during 
expeditions in the region, deep-sea ecologist 
Muriel Rabone of the Natural History Mu- 
seum in London found that just 436 species 
identified in the hauls have been officially 
named, suggesting more than 5500 others 
may remain to be described, she and her col- 
leagues reported in Current Biology. “With 
every sample, we see new species,” Rabone 
says. “What we could lose if we mine is a 
question we can’t really answer right now.” 

This sea cucumber lives in an area at risk
from deep-sea mining.


Another recent study looked at the lingering
effects of deep-sea mining after the operators
and their equipment leave the area. In
2020, researchers with the Geological Survey
of Japan and other institutions conducted a
2-hour test of a small mining machine on
the cobalt-rich Takuyo-Daigo seamount, 
about 1900 kilometers southeast of Tokyo. A 
year after the operations, fish and other mo- 
bile organisms were 43% less abundant in 
the mining area and 56% less in nearby ar- 
eas, up to 150 meters away. The study, which 
was published last week in Current Biology, 
was small and short, but it suggests the area 
clouded by sediment plumes could be larger 
than previously thought. “We may need to 
broaden what we think of in terms of deep-sea
mining impacts,” says Travis Washburn,
a benthic ecologist now with the Washing- 
ton Department of Fish and Wildlife who 
led the study. “My biggest concern is moving 
forward on regula- 
tions too quickly, and 
our study reinforces 
that concern.” 

ISA will convene 
another meeting on 
30 October to try to 
nail down the envi- 
ronmental regula- 
tions, as well as other 
controversial details, 
such as how compa- 
nies will share finan- 
cial benefits with all 
of ISA’s member na- 
tions. A final set of 
rules likely won’t be 
ready for adoption even after that, Gianni 
says. All 36 countries with a seat on ISA’s 
Council would have to approve them before 
they go through, and “too many states are 
too far apart on many different aspects,” 
he says. 

Meanwhile, Nauru Ocean Resources could 
file its mining application. But Pradeep 
Singh, a lawyer and fellow at the Helmholtz 
Centre Potsdam’s Research Institute for Sus- 
tainability, doubts ISA will wave it through. 
A large majority of Council members would 
be opposed, he says, and the Council has 
taken the position it can deny an applica- 
tion submitted under the 2-year rule in the 
absence of regulations. 

A company spokesperson told Science that 
it would rather submit its application once 
the regulations are put into place. But if it 
or others lose patience with ISA, Gianni says 
they could find ways to advance an applica- 
tion because of the way voting works. When 
the members of the Council can’t unani- 
mously agree on the fate of an application, 
the decision goes to subgroups, where just 
a few members can force its approval. Na- 
uru is on the Council, as is Norway, which 
Gianni says is eager to proceed with deep-sea
mining. “A handful of countries can basically
hold the rest of the organization hostage,”
he says.


RESEARCH INTEGRITY 


Honesty papers retracted for data ‘discrepancies’


After retractions, colleagues expand scrutiny of work by behavioral scientist Francesca Gino

By Cathleen O’Grady


Behavioral science researcher
Francesca Gino has spent her
accolade-studded career studying
dishonesty. Her work, which includes
influential studies on how dishonesty
can fuel creativity and how people
justify immoral behavior, has tens of thou- 
sands of citations and is frequently covered 
by the media. 

But over the past month, the Harvard 
Business School professor has faced allega- 
tions that her own research is dishonest. 
In June, data sleuths published a series of 
posts on their blog, Data Colada, detailing 
what they say is evidence of fraud in four of 
Gino’s papers. The bloggers say they alerted 
Harvard to the problems in 2021. (News of 
the alleged fraud was first broken by The 
Chronicle of Higher Education on 16 June.) 

Now, two of those papers have been re- 
tracted from the journal Psychological 
Science following an investigation by the 
Research Integrity Office at Harvard Busi- 
ness School. At least one more retraction is 
in the works. And this week, six of Gino’s
former co-authors launched a new initiative 
to figure out which of her other papers can 
still be deemed trustworthy. 

The researchers hope this project can 
help the community pick up the pieces in 
the wake of the misconduct scandal. Pa- 
pers by a suspected fraudster often land 
in a kind of purgatory, untrusted but not 
retracted because no one knows who col- 
lected the data or whether they are credible, 
says Uri Simonsohn, a behavioral scientist 
at Ramon Llull University who is one of the 
sleuths who uncovered the apparent fraud 
and also a former co-author of Gino’s. “That 
ambiguity seems really bad.” 

According to retraction notices published 
on 6 July, two Psychological Science papers 
co-authored by Gino were retracted after
an independent forensic firm appointed 
by Harvard found “discrepancies” between 
published data sets and earlier versions of 
the data found in Gino’s research records. 

One of the papers, published in 2014, re- 
ported that people who lied about whether 
they accurately predicted the outcome of a 
coin toss were subsequently more creative. 
According to the journal retraction notice, 
the investigation found that early versions 
of a data set recorded 31 participants as 
having cheated on the coin-toss task— 
but the data used in the paper included 
43 cheaters, because 12 participants had 
been changed from noncheaters to cheaters. 
The data for a task testing creativity also 
contained values that had been manually 
entered instead of calculated 
using the same formula as 
other values. When the investigators
corrected these “anomalies,”
the key findings no longer
stood up, the retraction notice 
states. 

The second paper, published 
in 2015, found that people who 
were asked to behave inauthen- 
tically by writing an essay that 
argued against their own opinion
felt “morally impure” afterward.
These participants rated
a series of cleaning products as
much more desirable than did
those who were not made to feel
“impure,” suggesting that participants
felt a need to cleanse
themselves. 

According to the retraction 
notice, the Harvard investi- 
gation found that the published data set 
seemed to be a combination of two data 
files found in Gino’s records. But it didn’t 
contain all the participant data from these 
two files—and it also contained extra data 
not in those files. In fact, there were no 
clear criteria for including or excluding 
participant data in the final data set, the re- 
traction notice states, and the results didn’t 
replicate when the investigators combined 
the two data files using the researchers’ 
original protocols. 

The two retraction notices state that 
Gino’s legal representation told the journal 
that Gino “viewed the retraction[s] as neces- 
sary” but disputed references to original data 
because “there is no original data available.” 
The editor-in-chief of Psychological Science, 
Patricia Bauer, declined to comment. 

The Data Colada bloggers (Simonsohn
and fellow behavioral scientists Joe
Simmons of the Wharton School of the University
of Pennsylvania and Leif Nelson of the
Haas School of Business at the University of
California, Berkeley) have identified irregularities
in the data files of two further studies,
including a 2020 paper in the Journal
of Personality and Social Psychology, which 
is published by the American Psychological 
Association (APA). They write that they 
have received confirmation “from outside of 
Harvard” that the university’s investigators 
found the original data from this paper had 
been modified. 

In a written statement to Science, APA 
Publisher Rose Sokol said APA is “aware 
of the concerns” about the paper, although 
she declined to comment on whether the 
Harvard investigation found evidence of 
manipulated data. A retraction is set to be 
published in September, Sokol said. Similar 
concerns about data in a 2012 Proceedings of
the National Academy of Sciences (PNAS) paper
have not led to a retraction, because the
paper was already retracted in 2021, after the
Data Colada sleuths found apparent fraud in
a separate study contributed by Dan Ariely
of Duke University’s business school.

Two of Francesca Gino’s papers have been retracted following an investigation
by Harvard Business School.

Gino did not respond to repeated requests 
for comment. But on 24 June, she published 
a statement on LinkedIn saying: “As I con- 
tinue to evaluate these allegations and as- 
sess my options, I am limited into what I can 
say publicly. I want to assure you that I take 
them seriously and they will be addressed.” 
Gino’s faculty page says she is on “adminis- 
trative leave.” Harvard declined to comment. 

Gino’s co-authors had no role in collect- 
ing the data for the four suspect studies, ac- 
cording to the Data Colada bloggers. Now, 
the Many Co-Authors project aims to figure 
out whether the data for the rest of her pa- 
pers are reliable. This week, Simonsohn and 
five other Gino co-authors emailed nearly 
150 collaborators, asking them who collected 
and handled the data for the papers, which
files are available for analysis, and whether
they still consider the work to be credible. 
In September, the team will launch a website 
that tracks and updates the results. 

The project was motivated primarily by 
concern for junior researchers—like Har- 
vard Ph.D. students whose few papers may 
all have Gino listed as a co-author. Even 
if these students gathered reliable data 
themselves, they now face serious damage 
to their careers if the field unfairly loses 
trust in their papers, Simonsohn says: 
“That just seems so cruel.” He hopes the 
effort will result in a paper that research- 
ers can cite alongside Gino’s publications, 
to explain why they believe a given study 
to be credible. 

The alleged fraud has shocked and angered
behavioral scientists.
Syon Bhanot, a behavioral econ- 
omist at Swarthmore College, 
says fabricated data can seri- 
ously harm other researchers. 
Trying to build on a faked result 
consumes time, energy, and re- 
sources, and can leave research- 
ers struggling to publish their 
null findings. “It’s a huge cost,” 
he says. 

Michael Sanders, a public pol- 
icy researcher at King’s College 
London, says his team has spent 
about $250,000 running large 
field trials based on the effects 
of the 2012 PNAS paper, which 
found that people who sign an 
honesty declaration at the top 
of a form rather than the bottom
behave more honestly.
These efforts—which included 
a collaboration with the Guatemalan tax 
authority—were all wasted, he says, “be- 
cause we were digging in the wrong place.” 

Sanders and others would like to see 
the creation of well-funded institutions 
dedicated to investigating claims of fraud, 
which they say would be fairer and more 
efficient than using the limited resources 
of whistleblowers, journals, and universi- 
ties. But, “My intuition is that you cannot 
stop fraud by catching it,” Simonsohn says. 
“All efforts should go on prevention. Catch- 
ing it is so erratic.” 

To reduce how many people cheat in the 
first place, academia’s reward system needs 
to change, Sanders says. He points out that 
Gino has been one of Harvard’s highest 
paid professors. In its 2019 tax filings, the 
university reported Gino’s pay from Har- 
vard and “related organizations” at about 
$1 million. “If you win this tournament,” 
Sanders says, “the prize is enormous. And 
that makes the incentive to cheat really,
really high.”
RESEARCH SECURITY


DOD grantees could be subject 
to extensive public disclosures 


House defense bill would vastly expand requirements 


By Jeffrey Mervis 


The U.S. House of Representatives has
approved new rules requiring anyone 
working on an academic research 
project funded by the Department of 
Defense (DOD) to disclose detailed 
personal information and work histo- 
ries on a public website. It’s the latest gambit 
by members of Congress who feel universities 
aren’t doing enough to prevent China from
stealing government-funded research. 

The new reporting regimen, if ultimately 
adopted by Congress, would go far beyond 
what DOD or any other research agency 
now requires. And that has raised concerns 
among scientists about government over- 
reach. “Yes, research security is a real is- 
sue,” says Alex Aiken, a computer scientist 
at Stanford University who tracks national 
research policy. “But this seems excessive. 
What purpose would it serve? And why 
should it all be made public?” 

In addition to posing a threat to privacy, 
Aiken and others worry the requirements 
would be counterproductive, creating new 
risks by generating a publicly available di- 
rectory of who is doing what type of DOD- 
funded research. 

President Joe Biden’s administration 
shares both those concerns. A 10 July White 
House statement opposing the amendment 
says it “would make public detailed informa- 
tion on all Department research performers 
that could create an inadvertent national 
security risk ... [and] could jeopardize the 
Department’s ability to fund universities in 
states with nondiscrimination laws that prohibit
citizenship and nationality reporting.”

The proposed changes are tucked into the 
National Defense Authorization Act (NDAA), 
a massive bill that provides annual policy 
and spending guidance to the agency. The 
language about DOD grants is absent from a 
version of the bill the Senate began to debate 
this week, so its ultimate fate is uncertain. 
But Representative Jim Banks (R-IN), who 
authored the new provisions, calls them “a 
common sense approach.” 

“We need far more oversight of sensitive 
DOD research at our universities,” Banks 
said before the House Armed Services Com- 
mittee adopted his amendment last month 
on a straight party-line vote. “We never 
would have allowed Russian scientists to 
study rocketry during the Cold War. So 
why would we let scientists with ties to the 
Chinese Communist Party work on defense 
projects today?” 

Banks pitched the amendment as a way 
to thwart those bent on helping China. (The 
House version of the NDAA approved on 
14 July would also eliminate all DOD fund- 
ing to the EcoHealth Alliance, a nonprofit 
that worked with a Chinese virology institute 
before the pandemic.) But the disclosure re- 
quirements apply to anyone expected to have 
“access to information” relating to any DOD 
grant, including undergraduates, graduate 
students, and postdocs. Although Banks em- 
phasized that his amendment would prevent 
the loss of sensitive or classified research, it 
applies to university-based projects, almost 
none of which are classified. 

The amount of information being sought
is vast. It starts with such personal details as
date and place of birth, immigration status,
and a complete employment history, including
“all previous and concurrent research,
academic, and corporate positions, ties, or
relationships.” Researchers would also need
to list “all publications, anywhere and in any
language,” as well as any “foreign funding, research
collaborations, and in-kind support.”
In addition, the lead scientist on a grant
would have to list “any direct, indirect, formal,
or informal collaboration ... with any
third-party persons or entities.”

Scientists question what Representative Jim Banks
(R-IN) calls his “common sense” approach to better
research security.

In opposing the amendment, the top 
Democrat on the committee, Representative 
Adam Smith (WA), didn’t dispute the impor- 
tance of research security. But he said Banks’s 
prescription was redundant.

“There are sufficient provisions already in 
place to protect our military secrets,” Smith
said before joining every Democrat in voting 
against the amendment. “Every fundamental 
research project already goes through a thor- 
ough risk assessment before it is awarded ... 
and there are severe restrictions on who can 
have access to classified research.” Asking for 
so much additional information, he added, 
“would have a chilling effect on our ability to 
collaborate with any foreign scientist.” 

Aiken worries that publicly disclosing 
what he calls “historical associations that 
are no longer active” could sully the reputa- 
tion of scientists who collaborated with Chi- 
nese colleagues during an earlier era when 
their universities strongly encouraged such 
partnerships. “I don’t think people should be 
punished retrospectively,” he says.

Faculty at Stanford and other universities 
already provide information about foreign 
collaborators to comply with existing rules 
for any federal grant, Aiken notes. But the 
Banks amendment asks for much more—and 
would broadcast it to the world.

“Public disclosure means foreign governments
can use the information, too,” Aiken
says. “And I’m sure those countries would 
learn a great deal about the network of con- 
nections of the U.S. research community 
from these disclosures.” 

Science lobbyists didn’t mount a cam- 
paign against the Banks amendment when 
the NDAA, which contains controversial 
language on many social issues, was de- 
bated on the House floor. “We didn’t want 
to push for a vote that we could lose,” one 
lobbyist says. But the NDAA is one of the 
few must-pass pieces of legislation in Con- 
gress. And science advocates are hoping 
Senate negotiators will insist that Banks’s 
language be dropped from the final version 
of the bill. 
SCIENTIFIC COMMUNITY


Scientists pursued for massive conference fees


Speakers at online COVID-19 debates battle claims they owe as much as €80,000 


By Michele Catanzaro 


When Björn Johansson received an
email in July 2020 inviting him
to speak at an online debate on
COVID-19 modeling, he didn’t
think twice. “I was interested in
the topic and I agreed to participate,”
says Johansson, a medical doctor
and researcher at the Karolinska Institute.
“I thought it was going to be an ordinary
academic seminar. It was an easy decision
for me.”

Three years later, Johansson has come 
to regret that decision. The Polish com- 
pany behind the conference, Villa Europa, 
claims he still owes them fees for taking 
part, and is seeking payment through a 
Swedish court. After adding legal costs 
and interest to the bill, the company is 
demanding a whopping €80,000. 

Johansson isn’t alone. Dozens of re- 
searchers participated in the same series 
of online conferences on COVID-19 in 
2020 and 2021 and many have received 
demands for payment from Villa Europa. 
At least five are being pursued through 
courts in their own countries for fees of 
tens of thousands of euros, although sev- 
eral researchers are fighting back. 

But the case is peppered with puzzling 
circumstances. In court filings and inter- 
views, the researchers say the demands 
are illegitimate and based on deceptive 
license agreements. Little is known about the 
individuals who organized the conferences. 
And many of the demands hinge on the rul- 
ing of a Polish arbitration court whose very 
existence has been questioned by experts in 
the country. 

Science has talked with 10 of the speakers, 
all of whom tell similar stories. In early 2020, 
somebody calling himself Matteo Ferensby, 
whose email signature mentioned the Uni- 
versity of Warsaw, invited them to speak at 
online webinars on the mathematical and 
computational modeling of COVID-19. 

The University of Warsaw has no em- 
ployee by that name, according to the insti- 
tution’s press office. And there is no track 
record of scientific publications from a 
Matteo Ferensby. 

But many scholars were convinced to join
when they saw online that Ferensby had
previously organized webinars on theoretical
physics, computer science, and theoretical
geography. Serious scientists had joined
those events, among them the 1999 Nobel
laureate in physics, Gerard ’t Hooft. And at
this early stage of the pandemic, scientists
“were sitting in shock in their homes and
welcomed the possibility of ... improving the
situation with their research,” says Marco
Baiesi, a physicist at the University of Padova
who spoke at one event.

At least 11 COVID-19 webinars took place
between April 2020 and June 2021, according
to a list compiled by one of the speakers,
Axel Brandenburg, a physicist at the Nordic
Institute for Theoretical Physics. The speakers
themselves—about 10 people in each
session—were the only audience, but participants
were told the recordings would be published
open access afterward. (Science has
not been able to locate videos of the events
that are publicly available online.)

Axel Brandenburg (left) and Björn Johansson are fighting claims
they owe conference organizers thousands of euros.

All the scientists interviewed by Science 
say Ferensby’s initial messages never men- 
tioned conference fees. When one speaker, 
Francesco Piazza, a physicist now at the Uni- 
versity of Florence, directly asked Ferensby 
whether the organizers would request a fee, 
Ferensby replied, “No, we are talking about 
science and COVID-19.” 

But after the events, the speakers were 
approached by a conference secretary, who 
asked them to sign and return a license agree- 
ment that would give Villa Europa—named in 
the document as the conference organizer— 
permission to publish the webinar record- 
ings. Most of the contracts Science has seen 
state that the researcher must pay the com- 
pany €790 “for webinar debate fees and open 
access publication required for the debate 
proceedings” plus €2785 “to cover editorial 
work.” These fees are mentioned in a long
clause on the last page of the contract, and
are written out in words rather than num- 
bers, without any highlighting. 

Many of the speakers, already busy study- 
ing COVID-19 and under pressure from the 
transition to remote teaching, did not no- 
tice these clauses. The pandemic “meant 
working 11 to 12 hours per day,” says another
speaker, Johannes Müller, a mathematician
at the Technical University of Munich. “The 
contract was unreadable [but] I eventually 
sent it.” 

Some researchers allege in court filings 
and interviews that they were sent back al- 
tered copies of their signed contracts con- 
taining an additional page where the fees are 
made explicit, and modified clauses, one of 
them stating that disputes can be settled by a
Polish arbitration court.

Then, several months after the events, 
some of the speakers received long let- 
ters signed by a person called Krzysztof 
Sienicki, CEO of Villa Europa, some in 
Polish. The letters sometimes demanded 
payments and late fees. 

At least 32 scholars in six countries 
have received these letters, according 
to Brandenburg. One researcher agreed 
to pay about €7200 to Villa Europa at 
the end of 2022, but many ignored the 
letters. Although some heard nothing, 
others—Brandenburg, Johansson, and 
other scientists in Sweden, Germany, 
and Spain—are facing new pressure. 

Each has received a letter from a local 
court informing them that Villa Europa has 
asked for the enforcement of a Polish arbitra- 
tion decision that found in favor of the com- 
pany. Villa Europa is claiming about €13,000 
to €25,000 from each researcher in fees, 
fines associated with payment delay, and 
court costs. (What boosted Johansson’s fee to 
€80,000 is a demand for payment for code 
that Villa Europa used to edit an animation 
displayed in his talk.) 

But the legitimacy of the Polish arbitra- 
tion court, Pan-Europejski-Sad-Arbitrazowy 
(PESA), has been questioned. Agnieszka 
Durlik, director general of the Court of Arbi- 
tration of the Polish Chamber of Commerce, 
says she has never heard of it. Moreover, un- 
til recently the online tool who.is indicated 
that PESA’s website was set up by Villa Eu- 
ropa itself in September 2021. Durlik lists 
several other procedural anomalies. “In my 
opinion this is fraud,” she says. It would not
be unprecedented: In 2019, 10 people were reportedly
charged in Poland for extorting companies
using a nonexistent arbitration court.

Science reached out by email to both 
Ferensby and Sienicki. Somebody signing as 
“COVID-19 Team” replied asking for written 
questions. Science sent questions but received 
no answers. 

Villa Europa has fallen under suspicion 
before. In 2018, the American Chemical So- 
ciety (ACS) requested arbitration after the 
company created a website (chemarxiv.org) 
whose URL resembled that of the preprint re- 
pository co-owned by ACS (chemrxiv.org). An 
arbitration court in the United States ruled 
that Villa Europa was “attempting to divert 
Internet users ... for commercial gain” and 
ordered the company to transfer the domain 
name to ACS. 

Participants in the earlier conferences run 
by Ferensby also report that the company 
billed them after the events—though the fees 
it demanded were only about €300, according 
to one speaker. “A Polish organization sent a 
bill [for] a conference that had been replaced 
by a Zoom meeting. It was ridiculous so I ignored
it,” says ’t Hooft, who participated in an
earlier webinar. 

For the researchers now under pressure 
from the courts, ignoring the demands is not 
an option. They have all submitted court fil- 
ings supporting their case. Filings seen by 
Science argue the demands are illegitimate 
and that they were deceived about what they 
were signing in the contracts. Brandenburg, 
for instance, has submitted documents to the 
Swedish court questioning the legitimacy of 
PESA and the connection of Ferensby with 
the University of Warsaw. 

Lina Forzelius, the judge in charge of that 
case, says a decision may be taken in Septem- 
ber. If the researchers show the demand is 
clearly incompatible with Swedish law—for 
example, if it becomes clear that the arbitra- 
tion decision is fake—the court may consider 
not enforcing it, Forzelius says. 

Even if the courts rule in the researchers’ 
favor, many say the experience has shaken 
them and made them distrustful of other 
scientific organizations. “I was invited by the 
Fields Institute for [Research in Mathemati- 
cal] Sciences to give an online talk about my 
paper, and so I asked them if they are real,” 
says Müller, who later realized his reply
was “embarrassing.” 

And if the courts rule for Villa Europa, 
more researchers may face the same ordeal, 
Müller says. “If they win, they can go to others
like me.”


Michele Catanzaro is a journalist based in Barcelona, 
Spain. This story was supported by the Science
Fund for Investigative Reporting and a grant from
Freelance Investigative Reporters and Editors. 
How good a physicist was the
architect of the A-bomb? 


Oppenheimer “was no Einstein,” says historian David C. 
Cassidy, but he did Nobel-level work on black holes 


By Adrian Cho 


This week, the much anticipated movie Oppenheimer hits theaters, giving famed
filmmaker Christopher Nolan’s take on the theoretical physicist who during
World War II led the Manhattan Project to develop the first atomic bomb.
J. Robert Oppenheimer, who died in 1967, is known as a charismatic leader, eloquent
public intellectual, and Red Scare victim who in 1954 lost his security clearance
in part because of his earlier associations with Communists. To learn about
Oppenheimer the scientist, Science spoke with David C. Cassidy, a physicist and historian
emeritus at Hofstra University. Cassidy has authored or edited 10 books, including
J. Robert Oppenheimer and the American Century. The following has been edited for
length and clarity.


Q: Oppenheimer’s name appears in the 
early applications of quantum mechanics 
and the theory of black holes. How good 
a physicist was he? 

A: Well, he was no Einstein. And he’s not
even up to the level of Heisenberg, Pauli,
Schrödinger, Dirac, the leaders of the
quantum revolution of the 1920s. One of
the reasons for this was his birth date. He
was born in 1904, so he was 3 years younger
than Heisenberg, 4 years younger than
Pauli. Those few years were enough to place
him in the second wave of the quantum
revolution and behind the main wave of
discovery, in what [philosopher of science]
Thomas Kuhn called the “mopping-up operation,”
applications of the new theory.


Q: He’s known for the Born-Oppenheimer
approximation, which helped extend quan-
tum mechanics from atoms to molecules. 
A: That was one of his most cited papers. 
He wrote that in 1927 while he was in Göttingen
[Germany, doing his doctoral work
with Max Born]. That same year, 
Heisenberg presented the uncertainty 
principle. Bohr and Heisenberg put out 
the Copenhagen interpretation [of quan- 
tum mechanics]. So here’s Oppenheimer 
doing an application, but a good one 
because it helped introduce quantum 
perturbation theory. 


Q: Even some of his contemporaries said he 
was a dilettante. How good was he in terms
of raw skill?

A: He had the skill and the brilliance. But 
he didn’t have the focus. He was not abso- 
lutely devoted to physics the way one of the 
great physicists would be. It was just one of 
his many passions. At the time he was doing 
physics, he read a lot of literature and lan- 
guages. Also, in the U.S., the empirical way 
of approaching physics was predominant 
[whereas European theorists were pursuing 
new concepts]. So the theorists’ job was to 
help experimentalists understand their data. 
As the physics and the experiments were 
shifting, his interest shifted, too. 

One of his main contributions had only a 
tenuous connection to observation, and that 
was black holes. That was an unfortunate 
situation. In 1939, he and a student, 
Hartland Snyder, published a paper predict- 
ing [collapsing stars could form] black holes. 
They couldn’t pursue it because the war was 
breaking out. A lot of people just ignored it 
because it seemed impossible—how could 
anything collapse to an infinitely dense 
point?—until [physicist John] Wheeler re- 
vived the matter in the 1960s. Not until the 
1990s was there any experimental evidence 
for black holes. I think Oppenheimer would 
have gotten a Nobel Prize if he was still alive 
at that point. 


Q: How did Oppenheimer, a theorist, end up 
directing the Manhattan Project, a gigantic 
experiment? 

A: It was even worse. Oppenheimer had no 
administrative experience. No Nobel Prize, 
unlike many of the people whom he would 
be administering. And worst of all, he
had a doubtful political background, with
associations with known Communists in 
the late 1930s. But [Lt. Gen. Leslie] Groves 
picked him specifically. First of all, because 
of Oppenheimer’s grasp of the physics and 
his ability to explain it to him. Also, because 
Oppenheimer was highly respected by the 
other physicists. But the main reason was 
Groves knew that Oppenheimer would be 
permanently vulnerable because of his po- 
litical associations. Groves suppressed a lot 
of the security agents’ reports on him and 
said, “I want this man for the job.” So,
Oppenheimer knew he was there only because
he was under Groves’s protection.

Lt. Gen. Leslie Groves (right) chose J. Robert Oppenheimer to lead the atomic bomb project.


Q: Ernest Lawrence, inventor of the cyclo- 
tron, was an experimentalist, relatively 
conservative, and free of political baggage. 
But Groves was not impressed with him. 
Why not? 

A: To be director, you had to understand 
the theory of the bomb inside and out. 
The other thing was Lawrence gave the 
impression that he was his own man. This 
was a military project and Groves wanted 
people under him who would accept being 
part of the chain of command. I suspect 
he didn’t feel that Lawrence would be able 
to do that. 


Q: Did Oppenheimer make specific techni- 
cal contributions to the bomb’s design? 

A: In one very important way. In 1942, [Pres- 
ident Franklin D.] Roosevelt ordered a crash 
program for the bomb. Arthur Compton 
selected Oppenheimer to head a theory 
group at [the University of California] 
Berkeley to work out all the details—what 
they would need, how they would do it. 

The group handed the results to Compton, 
and the Manhattan Project was born. When 
scientists arrived at the laboratory [in Los 
Alamos, New Mexico], they were given a 
series of lectures by Oppenheimer’s closest 
assistant, Robert Serber, on how the bomb 
would work based on that research. So it’s
Oppenheimer and his group who are setting
up the whole theory for the project. 


Q: So, in the modern parlance, he led the 
conceptual design of the thing. 

A: That theory group also went ahead and 
looked at fusion bombs, surveying the 
territory that would eventually become a 
hydrogen bomb. That was put to the side 
until after the war. 


Q: Oppenheimer lost his security clearance 
in part because he opposed the develop- 
ment of the hydrogen bomb. He’s often 
portrayed as a tragic figure, too naive to 
defend himself politically. Do you think he
was tragic or naive? 

A: I don’t think he was naive, because
he knew that he was vulnerable. And he
knew that they would probably come after
him as soon as he opposed this bomb. Of
course, he was very disappointed. But, as
a lot of authors have pointed out, he was
not as tragic as some of the victims of McCarthyism,
like his own brother and sister-in-law.
They were hounded, as were many
of his students. Oppenheimer didn’t lose
his job. He was not blacklisted. He wasn’t
forced to emigrate. As [physicist and
historian] Abraham Pais said, “All he lost
was a sense of power, which he craved.”
He was no longer an insider, but he was
still a highly regarded cultural figure and
a spokesman for American science.
Funding agencies say no to AI peer review


Concerns include confidentiality, accuracy, and “originality of thought” 


By Jocelyn Kaiser 


Neuroscientist Greg Siegle was at a
conference in early April when he
heard something he found “very
scary.” Another scientist was gushing
that ChatGPT, the artificial
intelligence (AI) tool released in
November 2022, had quickly become indispensable
for drafting critiques of the thick
research proposals he had to wade through 
as a peer reviewer for the National Insti- 
tutes of Health (NIH). Other listeners nod- 
ded, saying they saw ChatGPT 
as a major time saver: Draft- 
ing a review might entail just 
pasting parts of a proposal, 
such as the abstract, aims, 
and research strategy, into the 
AI and asking it to evaluate 
the information. 

NIH and at least one other 
funding agency, however, are 
putting the kibosh on the 
approach. On 23 June, NIH 
banned the use of online gen- 
erative AI tools like ChatGPT 
“for analyzing and formulating
peer-review critiques”—likely
spurred in part by a letter
from Siegle, who is at the
University of Pittsburgh, and 
colleagues. After the confer- 
ence they warned the agency 
that allowing ChatGPT to write grant re- 
views is “a dangerous precedent.” In a simi- 
lar move, the Australian Research Council 
(ARC) on 7 July banned generative AI for 
peer review after learning of reviews ap- 
parently written by ChatGPT. 

Other agencies are also developing a re- 
sponse. The U.S. National Science Founda- 
tion has formed an internal working group 
to look at whether there may be appropri- 
ate uses of AI as part of the merit review 
process, and if so what “guardrails” may be 
needed, a spokesperson says. And the Euro- 
pean Research Council expects to discuss AI 
for both writing and evaluating proposals. 

ChatGPT and other large language mod- 
els train on vast databases of information 
to generate text that appears to be writ- 
ten by humans. The bots have already 
prompted scientific publishers concerned 
about ethics and factual accuracy to re- 
strict their use for writing papers. Some 
publishers and journals, including Science,
are also banning their use by reviewers. 

For the funding agencies, confidentiality 
tops the list of concerns. When parts of a 
proposal are fed into an online AI tool, the 
information becomes part of its training 
data. NIH worries about “where data are 
being sent, saved, viewed, or used in the 
future,” its notice states. 

Critics also worry that AI-written reviews
will be error-prone (the bots are
known to fabricate), biased against nonmainstream
views because they draw from
existing information, and lack the creativity
that powers scientific innovation. “The
originality of thought that NIH values is
lost and homogenized with this process
and may even constitute plagiarism,” NIH
officials wrote on a blog. For journals, reviewer
accountability is also a concern.
“There’s no guarantee the [reviewer] understands
or agrees with the content”
they’re providing, says Kim Eggleton, who
heads peer review at IOP Publishing.

In Australia, ARC banned grant reviewers
from using generative AI tools 1 week
after an anonymous Twitter account, ARC_Tracker,
run by a researcher there, reported
that scientists had received reviews that
appeared to be written by ChatGPT. In one
clue, some got similar appraisals when they
pasted parts of their proposals into ChatGPT,
ARC_Tracker says. One review even
included the words “regenerate response,”
which appear as a prompt at the end of
a ChatGPT response. (Science confirmed
ARC_Tracker’s identity but agreed to the
researcher’s request for anonymity so they
and others can use the account to freely critique
ARC and government policies without
fear of repercussions.)

Scientists may think ChatGPT produces 
meaningful feedback, but it essentially 
regurgitates the proposal, ARC_Tracker’s 
owner says. Admittedly, some human reviewers
do that, too. But, “There is a very
big difference between a proper review,
which should provide insight, critique,
informed opinion and expert
assessment, and a mere summary
of what’s already in
a proposal,” the researcher 
wrote in an email to Science. 

Some researchers, however, 
say AI offers a chance to im- 
prove the peer-review process. 
The NIH ban is a “techno- 
phobic retreat from the op- 
portunity for positive change,” 
says psychiatric geneticist Jake 
Michaelson of the University 
of Iowa. Reviewers could use 
the tools to check their cri- 
tique to see whether they’ve 
overlooked anything in the 
proposal, help them assess 
work from outside their own 
field, and smooth language 
they did not realize sounds 
“petty or even mean,” Michaelson says. 
“Eventually I see AI becoming the first line 
of the peer-review process, with human experts
supplementing first-line AI reviews.
... I would rather have my own proposals
reviewed by ChatGPT-4 than a lazy human
reviewer,” he adds. 

The landscape is likely to change over 
time. Several scientists noted on NIH’s blog 
that some generative AI models work offline 
and don’t violate confidentiality—eliminating 
at least that concern. NIH responded that it 
expects to “provide additional guidance” for a 
“rapidly evolving area.” 

Mohammad Hosseini, an ethics postdoc 
researcher at Northwestern University who 
has written about AI in manuscript review, 
agrees the NIH ban is reasonable, for now: 
“Given the sensitivity of issues and projects 
the NIH deals with, and the novelty of AI 
tools, adopting a cautious and measured ap- 
proach is absolutely necessary.” 
thousands of African volunteers are taking action
By Sandeep Ravindran


Imagine joyfully announcing to your
Facebook friends that your wife gave 
birth, and having Facebook auto- 
matically translate your words to “my 
prostitute gave birth.” Shamsuddeen 
Hassan Muhammad, a computer sci- 
ence Ph.D. student at the University 
of Porto, says that’s what happened 
to a friend when Facebook’s English 
translation mangled the nativity news he 
shared in his native language, Hausa. 
Such errors in artificial intelligence (AI) 
translation are common with African languages.
AI may be increasingly ubiquitous,
but if you’re from the Global South, it prob- 
ably doesn’t speak your language. 

That means Google Translate isn’t much 
help, and speech recognition tools such as 
Siri or Alexa can’t understand you. All of 
these services rely on a field of AI known as 
natural language processing (NLP), which 
allows AI to “understand” a language. The 
overwhelming majority of the world’s 7000 
or so languages lack data, tools, or techniques
for NLP, making them “low-resourced,” in
contrast with a handful of “high-resourced” 
languages such as English, French, German, 
Spanish, and Chinese. 

Hausa is the second most spoken African 
language, with an estimated 60 million to 
80 million speakers, and it’s just one of more 
than 2000 African languages that are mostly 
absent from AI research and products. The 
few products available don’t work as well as 
those for English, notes Graham Neubig, an 
NLP researcher at Carnegie Mellon University.
“It’s not the people who speak the languages
making the technology.” More often
the technology simply doesn’t exist. “For ex- 
ample, now you cannot talk to Siri in Hausa, 
because there is no data set to train Siri,” 
Muhammad says. 

He is trying to fill that gap with a project 
he co-founded called HausaNLP, one of sev- 
eral launched within the past few years to 
develop AI tools for African languages. Many 
projects have their roots in Masakhane, a 
pan-African volunteer effort led primarily by 
African researchers and coders determined 
to create translation products that would let 
ordinary Africans reap the benefits of the 
internet—and better cope with its pitfalls. 
Muhammad, for example, hopes to use these 
tools to help fight hate speech on social media 
and decolonize science by making research 
papers more accessible in African languages. 

Similar projects have sprung up 
elsewhere across the Global South 
and among Indigenous communi- 
ties in New Zealand and the Ameri- 
cas, aiming to use AI to preserve 
and revitalize languages discarded 
or disregarded because of colonial- 
ism. The work hasn’t yet produced 
the equivalent of a Siri or Google 
Translate, but these efforts are de- 
veloping the data sets and software tools 
needed to build one. Jade Abbott, an NLP 
researcher and director at African startup Le- 
lapa AI and co-founder of Masakhane, says 
the broader goal is to help more people in the 
Global South join the global economy. “The 
world of the internet is not a place for our 
languages yet, and it needs to be,” she says. 


BETTER AI for African languages could empower
a huge number of people to access
jobs and other opportunities that are now 
closed off to them, says Ignatius Ezeani, an 
NLP researcher at Lancaster University and 
a member of Masakhane, which has over 
2000 volunteers from more than 30 coun- 
tries. Ezeani says most Nigerians, including 
his parents, don’t speak English. As a result, 
they “struggle with their education, they 
struggle with the economy, with the agriculture,
with the law, with health care, with disaster
response,” Ezeani says.

Founded in 2019 by Abbott and her col- 
league Laura Martinus, Masakhane, which 
means “we build together” in isiZulu, hopes 
to help non-English speakers overcome such 
struggles. For example, African startups are 
using Masakhane’s data to build AI transla- 
tion tools and chatbots to help people access 
financial services in their native languages. 
Such tools would also enable them to fol- 
low African news and government and legal 
communication—which in most countries 
currently exist primarily in English, French, 
or Arabic. “We need to try to use all these
tools, if possible, to correct the errors of colonization
and dehumanization of the last few
centuries,” Ezeani says.

Abbott has a similar goal for science. 
Masakhane’s Decolonize Science project, 
which Muhammad is also involved in, aims 
to develop machine translations of Afri- 
can preprint research papers released on 
AfricArXiv. The preprints are often in English 
or European languages, but the project plans 
to translate them into six diverse African 
languages: isiZulu, Northern Sotho, Yoruba, 
Hausa, Luganda, and Amharic, together spo- 
ken by about 140 million people. 

To create these tools, Masakhane can’t just 
copy what Google or Meta does—throwing 
massive amounts of data and computing 
power at the complex task of understanding
a language. NLP works by breaking the
task down into many smaller steps that
machine learning algorithms can solve in- 
dividually, by recognizing patterns in the 
text. One algorithm might split a paragraph 
of text into separate sentences. Another 
would then deconstruct each sentence into 
individual words. Additional models try to 
analyze each word separately to figure out 
whether it’s a noun, verb, or some other 
part of speech, and how different words in 
the sentence relate to each other. 
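
The step-by-step breakdown described above can be illustrated with a minimal Python sketch. It is not how any production system or Masakhane tool works: the splitting rules are naive regular expressions, and `TOY_LEXICON` is a hypothetical hand-written lookup standing in for a trained part-of-speech model.

```python
import re

# Hypothetical, hand-written lexicon standing in for a trained tagger.
TOY_LEXICON = {
    "the": "DET",
    "model": "NOUN",
    "reads": "VERB",
    "text": "NOUN",
    "it": "PRON",
    "tags": "VERB",
    "words": "NOUN",
}


def split_sentences(paragraph: str) -> list[str]:
    """Step 1: split a paragraph wherever whitespace follows ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def tokenize(sentence: str) -> list[str]:
    """Step 2: break a sentence into word tokens and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def tag(tokens: list[str]) -> list[tuple[str, str]]:
    """Step 3: label each token; anything outside the toy lexicon gets 'X'."""
    return [(t, TOY_LEXICON.get(t.lower(), "X")) for t in tokens]


def pipeline(paragraph: str) -> list[list[tuple[str, str]]]:
    """Chain the three steps: sentences, then tokens, then tags."""
    return [tag(tokenize(s)) for s in split_sentences(paragraph)]


if __name__ == "__main__":
    for tagged_sentence in pipeline("The model reads text. It tags words."):
        print(tagged_sentence)
```

Each stage here depends on assumptions that hold for English (spaces between words, a fixed set of sentence-ending marks), which is part of why such rules do not carry over directly to other languages.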

Many current AI models learn to do all 
this by training on immense amounts of text 
data. “Google has basically scanned virtually 
every piece of human literature in the world, 
and so they have this huge data set,” says
Michael Running Wolf, a software engineer 
and AI ethicist who founded Indigenous in 
AI. For high-resource languages such as Eng- 
lish, those data can come in large part from 
web crawler programs that vacuum up all the 
text on the internet. African languages, how- 
ever, are virtually absent from the internet. 
“It’s not a purely technical problem, it’s a societal
problem,” Abbott says. Under colonialism,
Africans were heavily discouraged from
using their native languages, particularly in 
writing. “People were taught to feel ashamed 
for their own language,” she says. That leaves 
little written text for AI translation models 
to train on, let alone annotated speech for 
speech-to-text or voice recognition. 

The data scarcity isn’t necessarily an insurmountable
problem, Abbott says. Masakhane
lacks Silicon Valley’s vast computational resources
and cutting-edge software tools. It
has to make do with older models that run 
on simpler hardware. And, she points out, “If 
you don’t have as much data, there’s no point 
having that big a model anyway, it’s not going 
to give you any advantage.” 

Masakhane and a project for the Maori 
language called Papa Reo have found that a 
bit of data can go a long way. Papa Reo, for ex- 
ample, created an AI model using 300 hours 
of audio it collected by holding a competition 
to encourage people across New Zealand to 
record themselves speaking specific phrases 
in the Maori language, te reo Maori. 

Masakhane has also developed techniques
to create more data-efficient language
models. In a recent paper, David Adelani, a 
Masakhane member and NLP researcher at 
University College London, and colleagues 
showed that instead of the 100,000 
or 1 million sentences typically used 
to train NLP systems for high re- 
source languages, existing models 
trained on large data sets to work 
with multiple languages could 
be fine-tuned to work with just 
2000 sentences. The examples were 
drawn from high-quality transla- 
tions of African news in 16 lan- 
guages, including eight the model had never 
been exposed to before. That’s a hopeful sign 
that existing models can be adapted for low- 
resource languages, Adelani says. 

But even if these approaches require fewer 
data, Masakhane still has to collect those 
data from scratch. They've developed a par- 
ticipatory research process to create data sets 
based on community input, whether from 
news articles or from discussions with volun- 
teers. “Most people are volunteering and do- 
ing that for the love of their language, for the 
survival of their language,” says Adelani, who 
has contributed to participatory research 
projects by Masakhane. 

For example, when working with Khoekhoegowab,
a very low resource language
from Namibia, the consortium held an
8-day workshop with native speakers. The 
researchers started discussions with com- 
munity participants using seed words in 
relevant topics, and jointly workshopped 
them into a list of sentences that commu- 
nity members deemed to be a natural use of 
those words. These efforts gathered smaller 
quantities of high-quality data directly rel- 
evant to training translation models, in 
contrast to the immense quantities of low- 
quality data big tech companies harvest 
from the internet. 

“It really kind of changes the entire nature
of the way we see data,” Abbott says. “Instead
of a thing that’s scraped and extracted, it’s
this beautiful area of creation,” she says.
A challenge for machine translation
As of 2022, there were more than 2000 living languages in Africa. With 520 languages, Nigeria accounted
for about one-fourth of the total. For most African languages, data and software tools for natural language
processing are scarce.
Map legend: number of living languages per country (1-20, 20-55, 55-123, 123-273, 273-520; Nigeria: 520 living languages); official language (Arabic, English, French, Portuguese/Spanish, other or no data).


THE MASAKHANE RESEARCHERS are also cre- 
ating databases that address more specific 
needs in NLP. There’s probably nothing more 
personal than ensuring that an AI under- 
stands your name and where you live. And yet 
when it comes to African languages, most AI 
translation tools struggle with named entity 
recognition (NER), the process of identifying 
proper names—such as a person, location, 
or organization. 

To rectify this, Adelani, Abbott, and their 
colleagues helped create MasakhaNER, the 
first large-scale African language data set 
for NER. They annotated thousands of sen- 
tences from local news articles in 20 lan- 
guages, flagging proper names by hand, to 
create a data set to train AI models to detect 
and categorize named entities in those lan- 
guages. “NER is actually far bigger than just 
being an NLP task, it is about technology 
understanding and acknowledging you as a 
person,” Abbott says. 
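
To make the hand-annotation step concrete, here is a minimal Python sketch of one common way to store named-entity labels: the BIO scheme, in which each token is marked as Beginning an entity, Inside one, or Outside any entity. The article does not say which format MasakhaNER uses, so the scheme, the `PER` and `ORG` label names, and the `to_bio` helper are all assumptions for illustration.

```python
def to_bio(tokens, spans):
    """Convert annotated spans into one BIO tag per token.

    `spans` holds (start, end, label) tuples over token positions, with
    `end` exclusive. Spans are assumed not to overlap.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


# Illustrative sentence built from names that appear in the article.
tokens = ["Jade", "Abbott", "co-founded", "Masakhane", "in", "2019", "."]
spans = [(0, 2, "PER"), (3, 4, "ORG")]      # hypothetical hand annotations
bio_tags = to_bio(tokens, spans)

if __name__ == "__main__":
    for token, bio in zip(tokens, bio_tags):
        print(f"{token}\t{bio}")
```

A model trained on many such token and tag pairs learns to predict the tags for unseen sentences, which is why the quality of the hand labels matters so much.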

She and her colleagues are also building 
language data sets for accurate sentiment 
analysis, which allows AI to understand 
the emotions of a particular text. When
Muhammad started his Ph.D. in 2018, NLP 
researchers relied on translations of Eng- 
lish sentiment analysis data. “This data 
set does not represent people in our com- 
munity in Nigeria, the culture, the values, 
the knowledge,” Muhammad says. He adds
that translation can alter sentiment, such 
as when “my wife gave birth” becomes “my 
prostitute gave birth.” 

Through HausaNLP, Muhammad created 
an African language data set for sentiment 
analysis for the four most widely spoken 
Nigerian languages—Hausa, Igbo, Nigerian- 
Pidgin, and Yoruba. Volunteers helped him 
manually annotate about 30,000 tweets in 
each language with their corresponding senti- 
ment, creating a training data set for AI mod- 
els to detect sentiments in these languages. 
Lately, he has been focusing on one particu- 
lar sentiment—hate—and trying to use senti- 
ment analysis to automatically detect African 
language hate speech on social media. 

Twitter, for example, has the ability to au- 
tomatically block offensive tweets in English, 


but it has no such function for African lan- 
guages. “There isn’t even a data set to train a 
model to be able to understand whether this 
is hate or not hate,” Muhammad says. Hate
speech in African languages has to be manually
swatted down. That is nowhere near as effective
as automated blocking, and aggrieved
users often have to actively retweet a hateful 
tweet in order for it to get flagged and taken 
down by Twitter. Muhammad hopes his data 
set can help make online spaces safer for Af- 
rican language speakers. 


BUILDING AFRICAN language data sets has 
been essential, but it’s only one step toward 
making NLP work for African languages. 
Masakhane has also had to develop bespoke 
NLP tools. 

Back in 2019, most NLP tools were built 
for English and a few other languages that 
are structured very differently from African
languages. For example, tools that “tokenize”
or separate English sentences into
individual words don’t work well for many 
African languages. That’s particularly true 
for African languages such as isiZulu that 
are agglutinative—their words are made by 
combining shorter words in a way that’s 
hard for English-trained AI to parse. Many 
African languages also include diacritics— 
marks such as a dot or an accent that guide 
pronunciation—making it harder to adapt 
English-trained AI to understand them. 
Some European languages do share these 
features, but so far only preliminary efforts 
have been made to adapt, say, German- 
trained AI to African languages. 
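
One concrete reason diacritics trip up tools built around unaccented English text can be shown with Python's standard `unicodedata` module. This is a sketch of a general Unicode issue, not something the article attributes to any particular tool: the same accented letter can be stored as one code point or as a base letter plus a combining mark, and crude "cleaning" that drops the marks erases distinctions the language relies on. The helper names are hypothetical.

```python
import unicodedata


def normalize(text: str) -> str:
    """Put text into one canonical composed form (NFC) before comparing."""
    return unicodedata.normalize("NFC", text)


def strip_marks(text: str) -> str:
    """Drop combining marks: the lossy shortcut some pipelines take."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


composed = "\u00e8"       # "è" stored as a single code point
decomposed = "e\u0300"    # "e" followed by a combining grave accent

if __name__ == "__main__":
    print(composed == decomposed)                         # raw strings differ
    print(normalize(composed) == normalize(decomposed))   # equal after NFC
    print(strip_marks("\u00e8d\u00e8"))                   # accents are lost
```

Normalizing keeps the diacritics while making equivalent spellings compare equal; stripping them makes differently accented words indistinguishable.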

Masakhane has made steady progress 
in developing tools to understand African 
languages, and in learning how best to ap- 
ply models trained on other languages. For 
example, Adelani found that models trained 
on English work poorly on Yoruba compared 
with models transferred from certain other 
African languages. “The better the language 
similarity, the better is the transfer for any
task,” he says. 

Adelani hopes to identify a small number 
of African languages that could be used to 
train versatile AI models that can work with 
many additional languages. “If you’re able to 
identify these good donor languages that are 
easy to transfer to others, then basically even 
though we have 2000 languages, we might be 
able to do great things with maybe 20 lan- 
guages,” he says. 


AI RESEARCHERS in the Global South and
among Indigenous communities are wary 
of big tech companies harvesting their data 
to train proprietary AI models. “Data is the 
new oil,” Running Wolf says. “And so there’s 
sort of this very colonial perspective of, this 
is a land grab,” he says. 
As a result, many Indigenous communities
are crafting protective licensing rules for
the data they collect and the AI tools they 
develop. The New Zealand-based Papa Reo 
project uses a data license stipulating that 
any projects that use Maori data must respect 
Maori values and pass on any benefits to 
them. Similarly, the CARE Principles for In- 
digenous Data Governance developed by the 
Global Indigenous Data Alliance are aimed 
at ensuring that Indigenous communities 
worldwide maintain sovereignty over their 
data and ensure that they are used according 
to their principles and for their benefit. 

But Masakhane, like some other projects, 
has so far kept its data sets and models open 
source. Some project leaders say it hasn’t 
been an easy decision and remains a topic of 
discussion. Given the long history of exploita- 
tion of Indigenous and Global South commu- 
nities and the continuing power imbalances 
between North and South, the potential mis- 
use of data is a real concern. But for now, 
Masakhane has decided that the benefits of 
data sharing—such as making it easier for 
big tech companies to work on their native 
languages—outweigh the risks. 

Several African startups—among them 
GhanaNLP, Lesan AI, and one founded by 
Abbott called Lelapa AI—have begun to develop
consumer tools from Masakhane’s data,
such as apps and websites for text translation 
and speech recognition and transcription. 
“Ultimately what I’d love to see is ownership, 
in that these tools are owned by the commu- 
nities that speak the languages rather than 
by the West,” Abbott says. She envisions AI 
tools for native languages as a way to keep 
these languages alive. “Often you find people
from the African continent who maybe left to
go study abroad who now can’t even speak to 
their mom because they don’t speak the same 
language,” she says. 

Thanks to Masakhane’s open-source policy, 
its data are also spurring efforts by Google 
and Meta to tailor their tools for African lan- 
guages. “Data is a bottleneck in a lot of these 
NLP projects,” Neubig says. On their own, big 
tech companies have little incentive to work 
on low-resource languages, but providing the 
data, as Masakhane and other projects have 
done, can act as a catalyst, he says. Abbott 
agrees. “Google Translate has managed to get 
some [African languages] up to reasonable 
performance with a lot of effort and push 
from people who are actually part of Ma- 
sakhane,” she says. A Google spokesperson 
acknowledged that limited data has slowed 
the company’s efforts to develop tools for 
African languages, but said: “As our systems 
evolve and more data becomes available, we 
will continue to improve access and support 
for these languages in the future.” 

Rather than focusing on individual lan- 
guages, Google and Meta have built large 
multilingual models for many hundreds of 
languages. For example, Google’s model re- 
leased in May 2022 supports more than 1000 
languages and helped add 24 underresourced 
languages—including 10 African languages— 
to Google Translate. Meta’s machine transla- 
tion model released in July 2022 supports 
200 languages, including more than 50 dif- 
ferent African languages. And Meta’s more 
recent models from May can recognize and 
produce speech for more than 1000 languages. 

In May, fellows of the Arewa Data Science Academy, a free training program for Nigerian youth who want to learn
data science and machine learning, participated in an artificial intelligence hackathon in Nigeria.

But Abbott says some of the big tech efforts
have fallen short. Google’s own researchers
reported that some of their 1000-language
model’s translations of low-resource lan- 
guages were rated very poorly by native 
speakers. And Meta’s model for 200 lan- 
guages performed poorly on some of the Afri- 
can languages it claimed to translate, Abbott 
says. She worries the publicity around Meta’s 
model could hurt funding for the handful of 
small African AI startups that might do bet- 
ter. “To the rest of the world, it sounds like 
this is a solved problem because Facebook’s 
gone and created this big model,” Abbott says. 
A Meta spokesperson declined to comment. 

Some Masakhane researchers give big 
tech companies credit for helping the home- 
grown efforts by sharing their models and 
data sets and by funding some community- 
led NLP projects. For instance, multiple 
Masakhane projects and similar projects for 
underserved languages have received fund- 
ing from the Lacuna Fund, which began as a 
collaboration between the Rockefeller Foun- 
dation, Google, and Canada’s International 
Development Research Centre and has since 
expanded. And some Google researchers 
have been part of the Masakhane community 
on Slack and helped mentor volunteers, in- 
cluding Muhammad. 


MASAKHANE’S MOST lasting legacy may be 
its people. For many volunteers, Masakhane 
has been a stepping stone toward pursuing 
academic research on AI, and the project 
has created a pipeline of AI researchers who 
are native speakers of African languages. “If 
the people who train the model understand 
the language, they can pick up issues in data 
sets,” Abbott says. 

Volunteers have collectively published 
hundreds of scientific papers, including 
translation models for at least 38 African lan- 
guages, and presented several workshops at 
major AI conferences. That’s a change from 
2019, when Abbott says she was one of just 
three researchers from the entire continent of 
Africa among the thousands of attendees at a 
major computational linguistics conference. 
“The fact that it got so many people started 
and built this community of thousands of 
people is the real success story,” Neubig says.

No one thinks AI will suddenly undo the 
ravages of hundreds of years of colonialism. 
“The damage has been done, and it took a 
long time to do the damage,” Ezeani says. But 
Masakhane and similar projects are a posi- 
tive step toward reducing the dominance of a 
handful of mostly European languages in AI 
research. “Many machine translation mod- 
els that we developed were the first of their 
kind,” Abbott says. “They exist because a person
cared about that language.”
Reliable earthquake precursors?


Global Positioning System (GPS) measurements suggest 
hours-long precursors to many large earthquakes 


By Roland Bürgmann


A meaningful earthquake prediction
must clearly define the expected time,
location, and magnitude of a future
event (1). Short-term earthquake prediction—that
is, the capability to issue
a warning from minutes to a few
months before a mainshock—is impossible 
without the existence of an observable and 
actionable precursor. Thus, a key goal is the 
discovery of a common preparatory faulting 
process that can tell us where, when, and
how big an impending earthquake is going 
to be (2). On page 297 of this issue, Bletery 
and Nocquet (3) present a systematic 
analysis of changes in horizontal position 
of approximately 3000 geodetic stations, 
which were measured by using the Global 
Positioning System (GPS), near 90 global 
earthquakes with magnitudes greater than 
7. They found that on average, horizontal 
movements of the stations exponentially 
accelerated in a direction consistent with
slow fault slip near the eventual earthquake
nucleation point in the last 2 hours before 
the earthquake ruptures. 

There is a long history of retrospective
studies, carried out after a large earth- 
quake has already happened, that suggests 
a wide variety of possible precursors that 
could potentially have been used to predict 
the earthquake (2). In the 1970s, there was 
some optimism that observations of reli- 
able precursors should be possible with im- 
proved geophysical observations, but that 
optimism waned in subsequent decades. 
There is a single case of an officially issued 
prospective prediction of a large earth- 
quake, based largely on the observation of 
hundreds of small earthquakes, in the days 
preceding the 1975 Haicheng earthquake in 
China. Even then, luck apparently played 
a big role in the identification of these 
events as foreshocks (4). Nonetheless, many 
earthquakes are preceded by foreshock sequences
(5), and a few of the largest earthquakes
of the 21st century, including the
2011 magnitude 9 Tohoku-oki earthquake 
near the Japan Trench, followed slow fault- 
slip episodes of varying sizes and dura- 
tions (6). That is, the fault that eventually 
ruptured and produced earthquake shak- 
ing sometimes started moving much more 
slowly before the mainshock. 

Tsunami waves hit the coast in Fukushima
Prefecture, Japan, after a powerful earthquake
under the North Pacific Ocean.

In laboratory experiments (7) and in
computer models of earthquake ruptures
(8), such precursory activity is common.
Recently, machine-learning methods were
used to study acoustic signals emitted by
laboratory faults and successfully predicted
the time remaining before the next laboratory
quake (9). However, natural foreshocks
cannot be distinguished from similar clusters
of background seismicity (10), and
observed slow preslip events have not ap- 
peared different from geodetically mea- 
sured slow slip transients that occurred 
without being followed by a large earthquake
(11). Overall, earthquake precursors
in nature appear to be quite common, but 
they apparently come in a variety of flavors 
(6) and have failed to reveal prognostic in- 
formation that could be used to produce a 
short-term earthquake prediction (12).

Bletery and Nocquet calculated the sum 
of observed horizontal displacements of 
GPS stations in the direction predicted by 
fault slip at the point where each earth- 
quake rupture nucleated, for each 5-min 
increment during the 2 days before rupture. 
Slow fault slip near the eventual nucleation 
point would contribute to increasing values 
of this summed function. For each individ- 
ual earthquake, the signal remains subtle 
at best, but about half of all earthquakes 
studied exhibited acceleration in horizontal 
displacement of nearby geodetic stations 
in the 2 hours before the mainshocks. The 
inferred average moment (a measure of the 
size of a slip event) of these short-duration 
preslip episodes is equivalent to that of a 
magnitude 6.3 earthquake. 
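
The summing step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: each station's horizontal displacement is projected onto the direction expected from slip at the nucleation point, and the projections are added across stations for every time step. Using unit direction vectors (rather than any amplitude weighting) and the function names below are assumptions.

```python
import math


def unit(vec):
    """Scale an (east, north) vector to unit length."""
    east, north = vec
    norm = math.hypot(east, north)
    return (east / norm, north / norm)


def stacked_projection(displacements, predicted):
    """Sum, per time step, each station's motion along its expected direction.

    displacements[s][t] is the (east, north) displacement of station s at
    time step t (for example, one step per 5-min increment).
    predicted[s] is the (east, north) direction expected at station s from
    slow slip near the nucleation point.
    """
    directions = [unit(p) for p in predicted]
    n_steps = len(displacements[0])
    stack = []
    for t in range(n_steps):
        total = 0.0
        for s, (ux, uy) in enumerate(directions):
            dx, dy = displacements[s][t]
            total += dx * ux + dy * uy      # dot product = projection
        stack.append(total)
    return stack


if __name__ == "__main__":
    # Two hypothetical stations, two time steps (arbitrary units).
    predicted = [(1.0, 0.0), (0.0, 2.0)]
    displacements = [
        [(1.0, 0.0), (2.0, 5.0)],
        [(3.0, 1.0), (0.0, -1.0)],
    ]
    print(stacked_projection(displacements, predicted))
```

Motion that follows the expected directions adds up across stations, while unrelated noise tends to cancel, which is why a signal too subtle for any single station can emerge in the sum.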

The authors present several statistical 
tests to build support for the proposed 
short-term precursory signal. Only 0.3% 
of 100,000 applications of their analysis 
carried out for randomly chosen 48-hour 
time windows led to the identification of 
a similarly significant, apparent precur- 
sor signal, indicating the specificity of the 
measurement. Bletery and Nocquet argue 
that the short duration and exponential 
acceleration of the preslip signal make the 
precursory phase they discovered different 
from quite commonly observed, indepen- 
dent slow slip events. 
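
The logic of the random-window test can be sketched as follows. The article does not specify the statistic the authors compared, so `observed` and `null_values` are placeholders, and both function names are hypothetical; the sketch only shows how a figure such as "0.3% of 100,000" arises as a simple fraction.

```python
import random


def exceedance_fraction(observed, null_values):
    """Fraction of null-window statistics at least as large as the observed one."""
    if not null_values:
        raise ValueError("need at least one null value")
    hits = sum(1 for v in null_values if v >= observed)
    return hits / len(null_values)


def random_window_starts(n_windows, record_hours, window_hours=48, seed=0):
    """Draw start times (in hours) for randomly placed windows in a record."""
    rng = random.Random(seed)
    latest_start = record_hours - window_hours
    return [rng.uniform(0, latest_start) for _ in range(n_windows)]


if __name__ == "__main__":
    # 300 of 100,000 null windows matching or beating the observed value
    # corresponds to 0.3%.
    null_values = [0.0] * 99_700 + [2.0] * 300
    print(exceedance_fraction(1.0, null_values))
```

A small fraction means that windows not followed by a large earthquake rarely produce a signal as strong as the one seen before real events.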

Although the results of Bletery and 
Nocquet suggest that there may indeed 
be an hours-long precursory phase, it is 
not clear whether such slow-slip accelera- 
tions are distinctly associated with large 
earthquakes or whether they could ever 
be measured for individual events with the 
accuracy needed to provide a useful warn- 
ing. It will be important to fully explore 
how often similar slow slip episodes occur 
as false starts, without being followed by 
earthquakes. There should also be similar 
analyses of foreshock activity. This will al- 
low evaluation of whether the last-hour 
slow-slip accelerations can be related to
more enduring foreshock activity, which
may last weeks to months (5). Where they 
exist, data from other precise geodetic sys- 
tems (such as strainmeters, inclinometers, 
and ocean-bottom pressure sensors) should 
be reviewed to independently assess the 
proposed short-term preslip episodes (13). 
Machine-learning methodologies may be 
valuable tools to optimally explore these 
complementary seismic and geodetic data. 
Most large earthquakes occur in subduction 
zones, which are largely under the oceans 
and thus quite distant from GPS monitor- 
ing networks. Improving the capability to 
properly detect offshore slow slip events 
will require installation of highly accurate 
and high-rate geodetic measurement sys- 
tems on the seafloor (14). 

The approach of Bletery and Nocquet re- 
quires full knowledge of the location and 
geometry of the mainshocks. Even if it will
eventually be possible to detect such pre- 
slip events without that information, the 
short warning window ahead of an immi- 
nent earthquake would limit the actions 
that could be taken to mitigate the impact 
on people. However, this could possibly be
integrated into automated earthquake early- 
warning systems, which already provide 
seconds to minutes of warning of shaking 
to come in some parts of the world, before 
seismic waves arrive from an earthquake
that has already started (15). If it can be con- 
firmed that earthquake nucleation often in- 
volves an hours-long precursory phase, and 
the means can be developed to reliably mea- 
sure it, a precursor warning could be issued, 
letting people know that it is time to let go of 
sharp utensils and get ready to “Drop, Cover, 
and Hold On,” before the Big One strikes.
Neural implants without brain surgery


Injectable bioprobes record single-neuron activity from within blood vessels 


By Brian P. Timko 


Brain-machine interfaces (BMIs) enable
direct electrical communication
between the brain and external
systems. They allow brain activity to
control devices such as prostheses
and computer programs, or to modulate
nerve or muscle function to compensate
for dysfunctional endogenous pathways. 
Collectively, BMIs have the potential to help 
individuals with paralysis or neurological 
disorders to regain function (1). However, 
recording from deep-brain regions currently 
requires surgery to implant probes, so less in- 
vasive methods for interfacing bioelectronic 
devices with neurons are required. On page 
306 of this issue, Zhang et al. (2) present a 
brain-surgery-free method to probe neural 
function in the rat brain. They achieved this 
by deploying a bioelectronic recording device 
from an endovascular catheter, using the 
brain’s vasculature as a natural delivery sys- 
tem. The technology could enable long-term, 
minimally invasive bioelectronic interfaces 
with deep-brain regions. 

Conventional BMIs use detection methods 
such as electroencephalography and elec- 
trocorticography, which measure local field 
potentials from ensembles of neurons at the 
surface of the scalp or on the dura mater (a 
meningeal layer that covers the brain) (3), as 
well as intracortical probes that can measure 
single-neuron activity from deeper regions. 
However, intracortical probes require crani- 
otomy and cause mechanical disruptions to 
the brain tissue. These probes also induce 
inflammation and fibrosis, which degrade 
device performance within weeks (4). These 
deleterious effects can be attributed to the 
large mismatch in mechanical stiffness be- 
tween the implant and brain tissue. Advances 
in materials science have addressed this issue 
with biomaterials (5), including organic elec- 
tronics (6), with mechanical properties tuned 
to match those of the human brain. In animal 
models, these devices integrated with brain 
tissues and caused minimal inflammation, 
even over the long term. A parallel approach 
is to engineer the geometry of the devices so 
that they are flexible and stretchable. For ex- 
ample, mesh bioelectronics were injected di- 
rectly into a living mouse brain and obtained 
stable single-neuron recordings for up to 8
months, with minimal inflammation (7). 

Nevertheless, delivering devices into the 
brain remains a challenge. Any surgery that 
penetrates the blood-brain barrier poses a 
risk for infection, so less invasive methods 
to deliver devices into deep-brain regions are 
crucial. The vascular system is a potential de- 
livery route because it mirrors the structure 
of the neuronal networks that it supports (8) 
and most neurons are within 10 to 20 µm of a
capillary (9). The vascular network can be ac- 
cessed through an incision in locations such 
as the jugular vein or carotid artery, which 
are used by neurosurgeons to implant self- 
expanding stents in the brain to treat con- 
ditions such as cerebral atherosclerosis (10). 
Stents have also been combined with elec- 
trodes (stent-electrode recording arrays) to 
record cortical neural activity from veins as 
narrow as 1.7 mm in diameter for up to 190 
days (11). These BMIs enabled four patients 
paralyzed by lateral sclerosis to perform simple
computer tasks by thought (12).

To deliver bioelectronics to regions of the 
brain with narrower, less accessible blood 
vessels, Zhang et al. designed a mesh-like
recording device that was much smaller and
more flexible than those used previously (see 
the figure). This device, which contained 16 
distinct recording elements, was loaded into 
an endovascular catheter. Using a rat model, 
they made an incision in the neck and guided 
the catheter into the internal carotid artery 
(ICA). When the device was expelled from the 
catheter, it expanded like a stent to record 
neuronal signals across the vascular wall. 
Because the device was so flexible, it could be 
deployed to previously inaccessible branches 
of the ICA with vessel diameters <100 µm.
This capability enabled Zhang et al. to re- 
cord distinct firing patterns from the middle 
cerebral artery (MCA) and anterior cerebral 
artery (ACA), which overlay the cortex and 
olfactory bulb, respectively. Despite the fra- 
gility of these small vessels, the implanted 
devices caused no substantial change to cere- 
bral blood flow, rat behavior, or the structure 
of the blood-brain barrier, and did not elicit 
an immune response. 

Because the device is so small, it could
record not only local field potentials, as
observed with the stent-electrode recording
arrays, but also single-neuron activity. This
ability to achieve noninvasive, single-neuron
recordings is important for studies of deep-brain
regions such as the medial temporal lobe, where
activity is not spatially clustered and is
therefore identifiable only at the single-neuron
level. Future studies could answer long-standing
questions about how memories are stored and
retrieved (13).

Recording of brain activity across blood vessel walls

A catheter containing the mesh recording device was inserted through an incision in the neck of a rat and
guided along the internal carotid artery (ICA) to the point where it splits into the middle cerebral artery (MCA)
and anterior cerebral artery (ACA). The device was then released into the MCA or the ACA and expanded to
record neuronal activity across the blood vessel wall.

Future BMIs could provide tailored thera- 
pies to the patient by recording and decoding 
their neural activity and then providing the 
appropriate modulatory stimuli (7). These bi- 
directional systems are especially relevant for 
advanced prosthetics, where the BMI enables 
both motor control and tactile feedback. 
Therefore, the next versions of endothelial 
probes should incorporate localized stimula- 
tion devices, as was demonstrated with the 
brain-injected meshes (7). These stimulation 
elements might also be used to electroporate 
the blood vessel wall, enabling localized drug 
delivery across the blood-brain barrier. 

Although the probes designed by Zhang 
et al. can enter vessels such as the MCA and 
ACA, smaller versions could reach capillaries, 
which have diameters <10 µm. These smaller
probes could be achieved by using nanoscale
recording and stimulation elements.
Nanoelectronics are also advantageous because
they can be small enough to enter the
cytosol, which would enable intracellular
interrogations of blood vessel walls (14). The
probes only have branch selectivity at the 
first bifurcation; future probes might contain 
an onboard guidance system, for example, 
magnetic particles that can be manipulated 
with an external field, allowing them to travel 
beyond the catheter while remaining under 
directional control. Such systems might ob- 
viate the need for a catheter altogether, al- 
lowing the devices to be injected through a 
standard needle and at other locations of the 
body, such as the arm. These endovascular 
probes might form the foundation for ma- 
chine interfaces throughout the body. 
ELECTROCHEMISTRY


Electrochemical waste-heat harvesting


A combined thermal and electrochemical device 
enhances voltage and hydrogen production 


By Boyang Yu and Jiangjiang Duan 


Harvesting waste heat (e.g., from solar
irradiation, industrial processes, or the
human body) is crucial for carbon neutrality
and sustainable development (1). Unfortunately, most
waste heat is distributed near ambi- 
ent temperature, making it inaccessible to 
conventional heat engines, which require 
large temperature differences (2). On the ba- 
sis of redox reactions at two electrodes with 
different temperatures, an electrochemical 
device called a thermogalvanic cell can be 
used for continuous waste-heat harvesting, 
with affordable, scalable, and eco-friendly 
characteristics (3). Nevertheless, the lim- 
ited heat-to-electricity conversion efficiency 
is a critical challenge for practical applica- 
tions. On page 291 of this issue, Wang et al. 
(4) report a photocatalytically enhanced 
thermogalvanic cell that combines in situ 
heat-to-electricity conversion with water 
splitting, boosting both electricity and hy- 
drogen production. This approach promises 
to improve harnessing of solar energy and 
other sources of waste heat. 
Thermogalvanic cells have the basic configuration
of two electrodes sandwiching an electrolyte,
where the electrolyte contains a redox couple
such as ferro/ferricyanide anions
[Fe(CN)₆⁴⁻/Fe(CN)₆³⁻] (1). To realize
heat-to-electricity conversion, a heat flux
input is first required to establish a temperature
difference (ΔT) between the two electrodes,
which can be obtained from diverse sources of
waste heat. Then, the oxidation of Fe(CN)₆⁴⁻ to
Fe(CN)₆³⁻, accompanied by an increase in
entropy, is thermodynamically favorable and
injects electrons into the hot electrode, whereas
the reduction reaction attracts electrons from
the cold electrode, generating a thermal
voltage (ΔV) (see the figure).
The thermopower of the thermogalvanic cell (Sₑ),
quantified as ΔV/ΔT, is the key to both high
cell voltage and heat-to-electricity conversion
efficiency (5). One promising way to improve Sₑ
is to build a concentration gradient of redox
ions (ΔC), which has been proven to boost the
efficiency of a single cell (6, 7). For example,
a locally high concentration of Fe(CN)₆⁴⁻ on the
hot side and Fe(CN)₆³⁻ on the cold side is
thermodynamically favorable for a large ΔV
(namely Sₑ). However, current strategies can
create only single-ion concentration gradients,
which limits the improvement of Sₑ. Achieving a
large Sₑ remains a challenge, owing to the lack
of strategies for creating multiple-ion
concentration gradients.
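
The link between concentration gradients and thermopower can be illustrated with a rough Nernst-equation estimate. The sketch below is not the model used by Wang et al.; it assumes ideal behavior (activity coefficients ignored), a one-electron couple, and a temperature coefficient of the formal potential of -1.4 mV/K, a value commonly quoted for aqueous ferro/ferricyanide and treated here purely as an assumption. The function names, temperatures, and concentration ratios are hypothetical.

```python
import math

R = 8.314462618   # molar gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def electrode_potential(temp_k, ox_to_red, alpha=-1.4e-3, t_ref_k=298.15, n=1):
    """Electrode potential in volts, relative to the formal potential at t_ref_k.

    ox_to_red is the local concentration ratio [Fe(CN)6 3-] / [Fe(CN)6 4-].
    alpha is the ASSUMED temperature coefficient of the formal potential (V/K).
    Activity coefficients are ignored (ideal-solution assumption).
    """
    formal_shift = alpha * (temp_k - t_ref_k)
    nernst_term = (R * temp_k / (n * F)) * math.log(ox_to_red)
    return formal_shift + nernst_term


def thermopower(t_hot_k, t_cold_k, ratio_hot=1.0, ratio_cold=1.0):
    """Effective thermopower dV/dT in V/K, taking dV = E_cold - E_hot."""
    delta_v = (electrode_potential(t_cold_k, ratio_cold)
               - electrode_potential(t_hot_k, ratio_hot))
    return delta_v / (t_hot_k - t_cold_k)


if __name__ == "__main__":
    t_hot, t_cold = 311.15, 301.15  # hypothetical 10 K difference near ambient
    cases = {
        "uniform electrolyte": thermopower(t_hot, t_cold),
        "single-ion gradient (hot side only)": thermopower(t_hot, t_cold, ratio_hot=0.5),
        "gradients on both sides": thermopower(t_hot, t_cold, ratio_hot=0.5, ratio_cold=2.0),
    }
    for label, value in cases.items():
        print(f"{label}: {value * 1e3:.2f} mV/K")
```

In this toy picture, enriching Fe(CN)₆⁴⁻ at the hot electrode lowers its potential and enriching Fe(CN)₆³⁻ at the cold electrode raises its potential, so gradients in both ions widen ΔV more than a gradient in one ion alone.
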
To this end, Wang et al. propose a strategy
that employs photocatalysts to facilitate the
conversion and accumulation of redox ions
and hence the formation of their concentration
gradients. In photocatalysis, photogenerated
electrons with sufficient energy are
excited from the valence band maximum
(VBM) to the conduction band minimum
(CBM), whereas holes are generated at the
VBM. By matching the CBM-VBM gap and
the redox potentials, the photogenerated
electrons and holes can promote the reduction
and oxidation reactions, respectively
(8). A typical example is solar water splitting
for hydrogen (H₂) production. In the
study of Wang et al., two types of photocatalysts
were carefully chosen to couple oxygen
(O₂) generation with Fe(CN)₆³⁻ reduction
and H₂ generation with Fe(CN)₆⁴⁻ oxidation.
By fixing the photocatalysts to the opposite
sides of the thermogalvanic cells (see
the figure), the photocatalytic Fe(CN)₆³⁻
reduction and Fe(CN)₆⁴⁻ oxidation lead to a
locally high concentration of Fe(CN)₆⁴⁻ at
the hot anode and Fe(CN)₆³⁻ at the cold
cathode, respectively. The generation of ΔC
for both Fe(CN)₆⁴⁻ and Fe(CN)₆³⁻ tripled Sₑ
to 8.2 mV K⁻¹, enabling a record-high
thermogalvanic performance of 8.5 mW m⁻² K⁻²
near ambient temperature (average ~33°C).
Furthermore, Wang et al. established a universal
linear relationship between Sₑ and
the H₂ generation rate, suggesting that better
photocatalysts could enable higher Sₑ.
The approach of Wang et al. provides es- 
sential design principles for photocatalyti- 
cally enhanced thermogalvanic systems. 
Using a rational design of photocatalysts 
and combining diverse solar fuel production
processes beyond H₂ are likely to improve
the thermogalvanic performance for
a library of redox couples. In addition, by
exploiting synergistic ionic thermodiffusion
(9), such photocatalytically enhanced
thermogalvanic cells are likely to
exhibit even higher Sₑ. Another potential use of
such a photocatalytically enhanced mechanism
is to expand the types of thermogalvanic
cells for large-scale integration,
which is also possible through the engineering
of ΔC (10, 11).

Simultaneous waste-heat harvesting and hydrogen generation
(Left) Thermogalvanic cells enable electrochemical conversion of heat into electricity through a reversible
reaction between a redox couple at two electrodes with different temperatures and small concentration
gradients (ΔC). (Right) By combining in situ photocatalysts for water splitting, a large and continuous ΔC
can be induced for the redox couple, which in turn boosts power generation. Concomitantly, hydrogen fuel
production is enhanced by electrochemical heat harvesting.

The work of Wang et al. also pioneers
an exciting new route that uses waste
heat to improve photoelectrochemical systems.
Notably, only a portion of the solar
spectrum is available for photocatalytic
processes, whereas the radiation at most
wavelengths is lost as heat. Although
thermochemical reactions and solar thermoelectric
generators have been developed to assist
the synthesis of solar fuels, their high
operating temperature (hundreds of kelvins)
necessitates extra solar concentrators
(8). By contrast, thermogalvanic conversion
by redox couples can directly exploit
near-ambient temperature solar heat in
photoelectrochemical systems, which are
also easily integrated in situ. Indeed, Wang
et al. found that the introduction of
thermogalvanic conversion more than triples
the H₂/O₂ generation rate under 1 sun (1
sun = 1000 W m⁻²). The underlying reason
is that the thermopower-generated electric
field can reduce the free-energy barrier of
the H₂/O₂ generation processes. Moreover,
the in situ thermogalvanic conversion has
the potential to enable not only solar heat
but also ubiquitous ambient heat energy to
enhance photoelectrochemical systems for
electricity and fuel production.

Unlike those of conventional thermo- 
electric technologies, the core materials of 
thermogalvanic conversion, redox couples, 
are also used in a broad area of energy 
chemistry such as photocatalysis (8), dye- 
sensitized solar cells (12), redox flow bat- 
teries (13), and electrochemical refrigera- 
tors (14), laying the foundations for in situ 
waste-heat harvesting and more efficient 
energy conversion. The work of Wang et al. 
marks a first step in this direction. Greater 
attention to such multifield coupling energy 
conversion should lead to the exploration of 
new synergistic mechanisms, and the per- 
formance of fundamental materials will be 
extensively researched, together contribut- 
ing to a more sustainable energy future. 
triggers immune-mediated
loss of pineal gland 
melatonin release 


By Harvey Davis and David Attwell 


The philosopher Descartes suggested
that the pineal gland was the seat of
the soul (1). Modern research has revealed
that it plays a key role in setting
the daily sleep-wake cycle (circadian
rhythm). Melatonin, a hormone
released from the pineal gland, is crucial 
for the maintenance of a healthy circadian 
rhythm, and its highest level in the blood 
is normally observed during darkness (2). 
However, a reduction in nocturnal melato- 
nin levels has been observed in patients and 
animal models with cardiac disease (3) and 
could be responsible for associated sleep 
disorders, including difficulty in initiating 
and maintaining sleep. On page 285 of this 
issue, Ziegler et al. (4) describe a pathway 
by which cardiac disease leads to immune- 
mediated sympathetic denervation of the 
pineal gland and a subsequent decrease in 
circulating melatonin, causing sleep disrup- 
tion. This provides a link between cardiac 
disease and sleep disorders and identifies 
an additional connection between the im- 
mune system and sympathetic function. 

A reduction in the mass of the pineal 
gland would reduce its capacity to release 
melatonin. However, Ziegler et al. found 
no change in pineal mass or cellular com- 
position either in mouse models of cardiac 
disease or in autopsy samples from pa- 
tients. Synthesis and release of melatonin 
by the pineal gland is driven by noradrena- 
line that is released by the sympathetic 
nervous system and acts on β-adrenoceptors.
Ziegler et al. observed fewer sympathetic
axons innervating the pineal gland
in patients with cardiac disease as well as
in mouse models of cardiac dysfunction. 
Furthermore, in mice, diurnal rhythms 
were disturbed after surgical removal of 
the superior cervical ganglia (SCG), which 
provide most of the pineal gland sympa- 
thetic innervation, and were rescued by 
melatonin supplementation. 

To interpret these results, sympathetic 
fiber loss should be viewed in the context 
of global changes in sympathetic neuron 
function during cardiac disease. Because 
sympathetic neurons tend to be hyperac- 
tive during cardiovascular disease (5, 6), 
it is possible that the reduction in sym- 
pathetic tone is less substantial than the 
decrease in the number of innervating ax- 
ons suggests. Other phenomena such as an 
increased release of neuropeptide Y (NPY) 
(7), which is co-released with noradrena- 
line from sympathetic neurons, or a con- 
version of noradrenaline-releasing axons 
into acetylcholine-releasing axons during 
cardiac disease (8) may also contribute to 
changes in melatonin release. 

Ziegler et al. observed fibrotic scarring 
and hypertrophy of the SCG in a mouse 
model of cardiac disease. This was accom- 
panied by an increase in infiltrating mac- 
rophages and a reduction in the number of 
pineal-innervating melatonin receptor 1A
(Mtnr1a)-expressing neurons, which was
assumed to reflect a loss of the neurons
rather than a decrease in Mtnr1a expression.
Hypertrophy and fibrotic scarring were also 
observed in postmortem samples of the SCG 
from patients with cardiac disease, and hy- 
pertrophy in patients was confirmed in situ 
using ultrasound measurements. 


Ziegler et al. suggest that the infiltrating
macrophages were responsible for the loss
of pineal-innervating sympathetic neurons
(see the figure). Consistent with this
hypothesis, depleting macrophages with
clodronate or preventing macrophage activation
(with cobra venom factor) reduced
the depletion of sympathetic fibers and the
decrease in melatonin levels that occurred
in one of the mouse models used. However,
it should be noted that clodronate also
has the potential to affect neutrophil
function (9). In addition, prior work has
shown that sympathetic neuron-associated
macrophages interact with SCG neurons
during normal physiology, including by
regulating the clearance of noradrenaline
(10). Thus, the relationship between
pineal-innervating sympathetic neurons and
the immune system may be more complicated
than the simple removal of neurons
by macrophages, and further work should
probe how this relationship changes in
cardiac disease.

Functional characterization of the
Mtnr1a-positive subpopulation of SCG
neurons would be a valuable next step.
Interestingly, there are also some
Mtnr1a-positive neurons in the stellate ganglia (11).
These ganglia provide most of the cardiac
sympathetic innervation and are unlikely
to innervate the pineal gland, raising the
question of what role these neurons play.
Previous work has divided sympathetic
neurons according to low and high NPY
expression in both the SCG and the stellate
ganglia. Ziegler et al. found that a similar
NPY expression difference exists between
pineal-innervating (low NPY) and “other”
(high NPY) sympathetic neurons, so it
would be interesting to understand how
the Mtnr1a-expressing subtype described
here fits within this framework. This may
have functional implications because
NPY-expressing and non-NPY-expressing SCG
neurons have different electrophysiological
properties (12). Another puzzle is why
Ziegler et al. observed lower expression
of genes encoding noradrenaline-synthesizing
enzymes in pineal-innervating sympathetic
neurons compared with sympathetic neurons
that innervate other organs. Answering
these questions will require a comprehensive
functional study of the
pineal-innervating sympathetic neurons.

From heart disease to disordered sleep

Under healthy conditions, the superior cervical ganglia (SCG) innervate the pineal gland, which releases melatonin
to aid sleep. Cardiac dysfunction (1) leads to macrophage infiltration (2) of the SCG, which also provide
a small amount of sympathetic innervation to the heart. This causes a loss of neurons (3) innervating the
pineal gland (4), decreasing melatonin secretion and altering sleep (5). Question marks denote uncertainty
over how macrophages are stimulated to infiltrate the SCG.

It remains to be understood how the
denervation phenotype observed by Ziegler et
al. develops and whether other end-organ
targets of the SCG, such as the heart,
lacrimal and salivary glands, iris, and
eyelid-raising muscles, are also denervated
during cardiac disease. Ziegler et al. did not
observe macrophage infiltration in the
(non-cardiac innervating) abdominal ganglia
of mice. Why, after cardiac disease,
macrophages selectively target specific
non-cardiac-innervating neurons, in a
ganglion that provides a minority of the
total sympathetic innervation to the heart,
remains elusive.

The findings of Ziegler et al. improve
understanding of the reduction in melatonin
levels that occurs during cardiovascular
disease and identify a mechanism by which
sympathetic neurons and immune cells may
interact. Understanding the mechanism and
specificity of this interaction may lead to
therapies for the sleep disruption that is
associated with heart disease.
Air quality policy should quantify effects on disparities


New tools can guide US policies to better target and reduce 
racial and socioeconomic disparities in air pollution exposure 
Patterson, Allen L. Robinson, Christopher
In countries around the world, exposure
to environmental pollutants commonly is
unequal across communities, leading to
disparities in harm to human health. Often
those facing the highest burdens have
lower socioeconomic status and are from
historically marginalized groups. Although 
these inequities are being increasingly recog- 
nized, eliminating them has proven difficult. 
In the United States, the Biden Administra- 
tion’s Justice40 Initiative uses the Climate and 
Economic Justice Screening Tool (CEJST) to 
identify disadvantaged communities and pri- 
oritize them for government programs and 
funding based on climate and environmental 
burdens and socioeconomic indicators. We 
found that although application of CEJST 
to guide ambient air pollution emission re- 
ductions may eliminate the modest exposure 
disparities by income and for disadvantaged 
communities, it may not ameliorate the fre- 
quently larger disparities by race-ethnicity. 
Effectively reducing or eliminating exposure
disparities will require regulatory decision-
makers to measure and report exposure dis- 
parities and assess how proposed policies 
may affect those disparities. 

Ambient air pollution is one of the larg- 
est environmental risk factors in the United 
States, causing an estimated 100,000 prema- 
ture deaths each year, which corresponds to 
billions of dollars of health damage each day. 
Although there have been substantial im- 
provements in ambient air quality in recent 
decades, disparities in exposure have been 
remarkably persistent (1-4), suggesting that
new approaches beyond the Clean Air Act 
and other current regulatory mechanisms 
are needed to reduce these disparities. 

In most cases, the largest exposure dispari- 
ties are by race-ethnicity, which represent a 
major environmental injustice. Disparities by 
other attributes (such as income, age, or edu- 
cation) are relevant but are generally much 
smaller than and statistically distinct from 
disparities by race-ethnicity (3, 4). Disparities 
by race-ethnicity exist in every US state, are 
seen for nearly all air pollutants and catego- 
ries of emission sources, and have continued 
across multiple decades (3, 5, 6). An important
underlying cause is racist policy, land-use
planning, and regulatory actions (for example,
refusal to offer loans and insurance, called
“redlining,” exclusionary zoning, racial covenants,
and decades of disparities in regulatory
oversight and enforcement) (7-11).

Predicted PM2.5 exposure and attributable deaths

Average PM2.5 exposure (left y axis) and attributable deaths (right y axis) for “business as usual” scenario,
disaggregated by disadvantaged community status, race-ethnicity, and income.

The Justice40 Initiative is a cornerstone 
of the Biden Administration’s effort to ad- 
dress environmental injustice. Its stated 
goal is that disadvantaged communities that 
are marginalized, underserved, and over- 
burdened by pollution receive at least 40% 
of the overall benefits of certain federal in- 
vestments. Justice40 is using CEJST to in- 
form allocation of tens of billions of dollars, 
across hundreds of government programs 
(such as in clean energy and transportation, 
workforce development, and remediation of 
legacy pollution) [see supplementary materi- 
als (SM)]. However, the extent to which this 
strategy addresses environmental disparities 
remains unstudied. 

We investigated how using CEJST to target
emission reductions might affect ambient air 
pollution exposures and exposure disparities. 
We found that application of CEJST may not 
ameliorate (and in some cases may increase) 
exposure disparities by race-ethnicity. This 
outcome likely reflects that CEJST does not 
explicitly use race-ethnicity as a factor to de- 
fine disadvantaged communities. This find- 
ing also highlights the broader problem of 
insufficient investigation of how existing or 
proposed policies will affect disparities in en- 
vironmental outcomes. 

Our calculations predict annual-average 
particulate matter (PM2.5; particles in the air
with diameter 2.5 µm or smaller) concentrations
throughout the contiguous United
States according to the emissions of each
chemical component of PM2.5 [primary (directly
emitted) and secondary (formed in the
atmosphere from precursors, such as ammonia
or nitrogen dioxide)] and from each
sector of the economy. We focused on PM2.5
because of the large monetized health damages
(the largest of any ambient air pollutant);
because it has an intermediate level of
disparities among air pollutants (3); because
it is one of the measures used in CEJST to 
identify disadvantaged communities; and 
because of the availability of data and models 
(see SM). We considered three future 20-year 
emission scenarios. In the “business as usual” 
(BAU) scenario, historical rates of emissions 
and emission changes (by PM2.5 component
and sector of economy) are continued into 
the future (by using linear extrapolation) 
as though the Justice40 initiative had not 
been implemented. This BAU is a plausible 
estimate for the isolated effects of future air 
pollution regulatory approaches. For the sec- 
ond and third scenarios, we modeled that in 
disadvantaged communities, the Justice40 
initiative leads to a doubling or quadrupling, 
respectively, of historical rates of emission 
reduction. (Herein, the term “disadvantaged
communities” refers to Census Tracts
identified by CEJST.) In all scenarios, non-Justice40
communities experience historical
BAU reduction rates. The doubling and
quadrupling scenarios represent aggressive
or very aggressive emission reductions in
disadvantaged communities (see SM). Those
additional emission reductions in disadvantaged
communities could reflect, for example,
upgrading, modernizing, or retrofitting
older equipment; more stringent monitoring
and enforcement of existing requirements;
efficiency improvements; pollution-control
devices; and granting of fewer permits for
new sources. If in reality Justice40 turns out
to be less spatially targeted than the doubling
or quadrupling scenarios, then the true
outcome from Justice40 may be between the
BAU and the doubling or quadrupling scenarios;
in that case, core conclusions of this
article would still hold.

Disparities in PM2.5 exposure and deaths for the three future scenarios

Disparities relative to the population-average, for “business as usual” (BAU) scenario and when emission-reductions
in disadvantaged communities (DAC) are double or quadruple the BAU rate. Top row, absolute disparities;
bottom row, relative disparities.

We analyzed the effect of these scenarios on
human exposure using a reduced-complexity
chemical transport model [Intervention
Model for Air Pollution (InMAP)] to predict
how changes in emissions would alter PM2.5
concentrations and concentration disparities.
InMAP simulates the fate and transport
of anthropogenic emissions leading to primary
and secondary PM2.5 and provides national
coverage at high spatial resolution, as
small as 1 km in urban centers. Future emissions
were estimated on the basis of the historical
National Emission Inventories from
the US Environmental Protection Agency
(EPA). The population and demographic
composition in the baseline year were applied
into the future. Exposure to PM2.5 contributes
to morbidity and premature mortality
by increasing rates of heart attack, stroke,
lung cancer, respiratory infections, and
more. In this work, we only considered increases
in mortality, which contributes most
of the monetized health impacts of ambient
air pollution. We do not expect the core
conclusions to change if we were to consider 
additional health endpoints. We assessed 
disparities (absolute and relative differences 
in population-average exposures between a 
demographic group and the overall population)
(see SM) (1-4, 12) for four groups: (i)
people living in disadvantaged communities; 
(ii) people with low income (people in house- 
holds with incomes at or below two times the 
poverty level); (iii) people of color [all people 
except non-Hispanic (NH) whites]; and (iv) 
the most exposed racial-ethnic group of 
the four groups considered (NH white, NH 
Black, NH Asian, and Hispanic). 
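
The disparity measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that "relative" disparity means the absolute difference divided by the overall population-average exposure, and the function names and three-tract numbers are hypothetical.

```python
from typing import Sequence, Tuple


def population_weighted_mean(exposure: Sequence[float],
                             population: Sequence[float]) -> float:
    """Average exposure across tracts, weighted by the people living in each."""
    total = sum(population)
    if total <= 0:
        raise ValueError("population must sum to a positive number")
    return sum(e * p for e, p in zip(exposure, population)) / total


def disparities(exposure: Sequence[float],
                group_population: Sequence[float],
                total_population: Sequence[float]) -> Tuple[float, float]:
    """Return (absolute, relative) disparity of a group versus everyone.

    Assumption: relative disparity is the absolute difference divided by
    the overall population-average exposure.
    """
    group_avg = population_weighted_mean(exposure, group_population)
    overall_avg = population_weighted_mean(exposure, total_population)
    absolute = group_avg - overall_avg
    return absolute, absolute / overall_avg


# Hypothetical tracts: concentration in each tract, total residents,
# and residents belonging to the group of interest. Made-up numbers.
exposure = [6.0, 8.0, 10.0]
total_population = [1000, 1000, 1000]
group_population = [100, 300, 600]

abs_disparity, rel_disparity = disparities(
    exposure, group_population, total_population
)
print(f"absolute: {abs_disparity:+.2f}, relative: {rel_disparity:+.1%}")
```

Because the group in this toy example lives disproportionately in the most polluted tract, its average exposure sits above the overall average, which is the pattern the four comparisons are designed to detect.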

In the current state [baseline year (see the
first figure, year “0”), before applying any
new emission-reductions], InMAP results indicate
that average exposure to PM2.5 is ~14%
higher for people of color than for the overall 
population. The Black population is currently 
the most exposed racial-ethnic group (dis- 
parity relative to population-average: Black, 
+20%; Asian, +14%; Hispanic, +10%; white, 
-7%). Disparities by race-ethnicity are larger 
than disparities for disadvantaged communi- 
ties (~6% higher than population-average) or 
by low-income status (~3% higher than pop- 
ulation-average). Those results from InMAP 
are consistent with findings from an empiri- 
cal model (see SM) (3). 

As expected, for all three emis- 
sion reduction scenarios, all demo- 
graphic groups experience cleaner 
air in the future. However, two key 
findings emerge with respect to 
exposure disparities. First, under 
BAU, exposure disparities by race-ethnicity
persist. As emission reductions
occur, PM2.5 concentrations
decrease at slightly different rates 
for different groups. For example, 
InMAP predicts that Asian people 
will soon become the most exposed 
group. However, concentrations re- 
main higher than average for Black, 
Hispanic, and Asian populations 
(see the first figure). In addition, 
racial-ethnic disparities persist and 
remain much larger than dispari- 
ties for disadvantaged communities 
and for low-income households (see 
the first figure). This finding under- 
scores that new regulatory strate- 
gies for emission reduction (deviat- 
ing from BAU) are needed to reduce 
emissions in ways that also address 
exposure disparities. 

Second, the two scenarios with 
enhanced emission reductions in disadvan- 
taged communities eliminate absolute and 
relative disparities for disadvantaged com- 
munities and for low-income populations. 
Yet these scenarios do not reduce the com- 
paratively larger relative disparities by race- 
ethnicity (although they do decrease absolute 
disparities) (see the second figure). Scenarios 
two and three increase the relative exposure 
disparity for the most exposed racial-ethnic 
group (see the second figure), relative to present-day
and the BAU future. The result suggests
that the enhanced emission reductions
in disadvantaged communities have more
exposure benefits for the overall popula- 
tion than for the most exposed racial-ethnic 
group. This outcome could be interpreted as
undermining a core environmental justice
goal: eliminating exposure disparities by 
race-ethnicity. 
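
A small worked example may help show why absolute and relative disparities can move in opposite directions. The numbers below are hypothetical and are not taken from the article's figures; the sketch again assumes relative disparity is the absolute gap divided by the overall average.

```python
def disparity(group_avg: float, overall_avg: float) -> tuple:
    """Return (absolute, relative) disparity for a group versus the average."""
    absolute = group_avg - overall_avg
    return absolute, absolute / overall_avg


# Hypothetical exposures (same concentration units throughout).
# Present day: the group is exposed to 9.6, the overall population to 8.0.
present_abs, present_rel = disparity(group_avg=9.6, overall_avg=8.0)

# Hypothetical future: everyone's air is cleaner, but the overall average
# falls proportionally more (8.0 -> 5.0) than the group's (9.6 -> 6.3).
future_abs, future_rel = disparity(group_avg=6.3, overall_avg=5.0)

print(f"present: absolute {present_abs:.2f}, relative {present_rel:.0%}")
print(f"future:  absolute {future_abs:.2f}, relative {future_rel:.0%}")
```

In this illustration the gap narrows in absolute terms while widening as a share of the average, which is why both measures need to be reported.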

Our findings are robust to several sensitiv- 
ity analyses, including considering alterna- 
tive methods and outcomes (see SM). Results 
from the sensitivity analyses indicate that 
only with enhanced emission reduction in 
or upwind of communities of color will both 
absolute and relative racial-ethnic disparities 
in exposure to PM2.5 air pollution be reduced.
Our findings regarding BAU are also supported
by concentration forecasts from
a high-resolution empirical model (see SM),
suggesting that results here for BAU are not
strongly dependent on the emission inventory
or on InMAP.

Failure of CEJST-directed emission re- 
ductions to directly address the largest 
source of exposure disparities (those by 
race-ethnicity) would very likely undermine 
the Biden Administration’s environmental 
justice goals. Compared with the national 
average, disadvantaged communities identi- 
fied by the current CEJST are composed of 
only modestly higher proportions of people 
of color (especially Black, Hispanic, and 
Indigenous populations) and low-income 
populations (see SM). This decision to ex- 
clude race-ethnicity as an indicator in CEJST 
reflects in part concern about potential po- 
litical and legal challenges if a federal policy 
or tool explicitly includes race as a factor for 
guiding Justice40 investments (for example, 
see the 29 June 2023 Supreme Court decision 
disallowing use of race as a factor in college 
admission decisions). Nevertheless, present- 
day racialized exposure disparities reflect 
in part decades of racist policy and practice 
(8, 10, 11). Because legacies of race-based 
actions helped create this problem, solving 
it is made more difficult if the government 
does not consider, or bars itself or is legally 
barred from considering, information about 
the racial makeup of communities as part of 
its decision-making and action. Tackling the 
challenges posed here will require system- 
atic assessments of how proposed regulatory 
strategies and tools would affect exposure 
disparities and whether racial-ethnic expo- 
sure disparities can be eliminated within a 
reasonable time frame (for example, in less 
than a decade). Our analysis provides a proof 
of concept of this sort of regulatory scenario 
testing and evaluation and demonstrates 
that new tools such as InMAP enable such 
analyses (6, 13, 14). 

Air quality regulation can be more effec- 
tively designed to improve overall air quality 
while also eliminating air pollution exposure 
disparities by race-ethnicity. This dual goal 
can be supported by regulatory impact analy- 
ses for air pollution that quantify whether 
and how relevant policies will not only affect 
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air quality but also reduce absolute and rela- 
tive exposure disparities. For example, the 
EPA’s “Status and Trends” reports, other regu- 
latory information, and accountability stud- 
ies should quantify disparities or exposures 
for overburdened communities. Although 
politically challenging, emission reduction 
efforts must address disparities by race-eth- 
nicity if we wish to uphold everyone's right to 
breathe clean air. 

Overall, Justice40 aims to address multiple
challenges, not just exposure to PM2.5.
Although we found that using the current
CEJST to target emission-reductions will not
eliminate racial-ethnic disparities in PM2.5
exposure and attributable mortality, there 
likely will be other environmental and justice 
benefits from Justice40, including economic 
opportunities from investments and building 
resilience to climate change in disadvantaged 
communities. At the same time, disadvan- 
taged communities, as defined by CEJST, 
comprise ~34% of the US population. The 
goal of delivering 40% of benefits to 34% of 
the population represents a modest deviation 
from an exactly proportional share of the 
Justice40 benefits. The finding that relative 
exposure disparities by race-ethnicity will 
not decrease (and may increase) with use of 
CEJST indicates that additional and more 
targeted actions will be needed to end racial- 
ethnic exposure disparities. For example, fu- 
ture iterations of CEJST could use a different 
set of locations or could aid in better targeting 
investments (that is, differentiating among 
CEJST locations). Other policies, including 
by states (such as in California, New Jersey, 
and Washington), also aim to address envi- 
ronmental disparities; the effectiveness of 
those policies at reducing disparities should 
also be evaluated as we have done here. 
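
The "modest deviation" noted above can be made concrete with a short calculation. This is an illustrative sketch only: it assumes benefits are spread evenly within each group and treats the 40% goal as an exact share rather than a floor; the function name is hypothetical.

```python
def per_capita_benefit_ratio(benefit_share: float,
                             population_share: float) -> float:
    """Per-person benefit inside the targeted group divided by per-person
    benefit outside it, assuming an even spread within each group."""
    inside = benefit_share / population_share
    outside = (1.0 - benefit_share) / (1.0 - population_share)
    return inside / outside


# Shares quoted in the text: 40% of benefits, ~34% of the US population.
ratio = per_capita_benefit_ratio(benefit_share=0.40, population_share=0.34)
print(f"per-capita benefit ratio: {ratio:.2f}")

# An exactly proportional allocation gives a ratio of 1.
proportional = per_capita_benefit_ratio(0.34, 0.34)
print(f"proportional allocation: {proportional:.2f}")
```

Under these assumptions a resident of a disadvantaged community would receive roughly 1.3 times the per-person benefit of other residents, compared with 1.0 under a strictly proportional split.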

Our analysis has several implications. First, 
the EPA and other agencies should quantify 
how proposed programs, regulations, and 
decision-making tools would affect environ- 
mental justice outcomes, especially exposure 
disparities by race-ethnicity. If possible, this 
should be undertaken when such initiatives 
are being developed, not after. Previously, in 
the realm of air quality, this type of national 
analysis would have been difficult to do be- 
cause of the computation costs and spatial- 
resolution limitations of many air quality 
models. However, recently developed air 
quality models such as InMAP make this type
of analysis faster and easier to carry out and 
provide national coverage at much higher 
spatial resolution than many conventional 
models. One can do similar types of calcu- 
lations with conventional models, but their 
higher computational cost hinders analysis, 
and their coarser resolution may mean that 
the results would underestimate total disparities.
One can use reduced-complexity models
to rapidly examine the impacts of multiple
policy options on exposure disparities to help
design optimal control strategies.

Second, our prior research indicates that 
in theory, location-based approaches can ef- 
ficiently eliminate exposure disparities by 
race-ethnicity within a reasonable time frame 
(12). We did not find a trade-off between re- 
ducing disparities and reducing overall air 
pollution averages. More work is needed to 
identify the most effective policies and strate- 
gies for achieving location-specific emission 
reductions. This could identify a new set of 
locations or a more targeted approach to 
emission reductions in those locations. 

Last, current approaches in the Clean Air 
Act have been effective at reducing average 
concentrations and absolute exposure disparities
(15), but relative disparities are generally
ignored and have persisted (1-4, 12). If,
as our results suggest, neither BAU nor the 
present CEJST can eliminate racial-ethnic 
exposure disparities, then new regulatory 
strategies are needed to advance environ- 
mental justice goals. 
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Honoring Anarcha 


An attempt to give voice to an enslaved subject of 
medical experimentation falls short 


By Jim Downs 


The story of how the 19th-century
American physician J. Marion Sims 
experimented on enslaved women and 
became the so-called “Father of Gyne- 
cology” has been well documented by 
several scholars, most powerfully by 
historian Deirdre Cooper Owens in her 2017 
book, Medical Bondage, which centers the 
enslaved women as key historical 
actors (1). In his new book, Say
Anarcha, J. C. Hallman attempts 
to enrich this story by focusing 
on one of the women, Anarcha, 
on whom Sims performed an esti- 
mated 30 experiments in an effort 
to cure her of obstetric fistula, a 
condition in which holes form be- 
tween the birth canal and bladder 
after prolonged, obstructed labor. 
To accomplish this, Hallman
scoured local archives, census records,
and plantation ledgers. He
also drew heavily on a database
of oral interviews with formerly
enslaved Black Southerners conducted
during the Great Depression (2).
Many historians warn of the dangers of
reading these records as literal evidence,
because the subjects were often elderly
when they recounted childhood memories
to white interviewers who frequently reworded
their testimony. He refers to the
resulting biography as “a comprehensively
researched work of speculative nonfiction”
that includes invented scenes, words, actions,
expressions, motivations, thoughts,
and even dreams.

Say Anarcha:
A Young Woman,
a Devious Surgeon,
and the Harrowing
Birth of Modern
Women’s Health
J. C. Hallman
Henry Holt, 2023.
448 pp.

Anarcha herself left no written record, and
in the absence of her own words,
Hallman selected excerpts from
various formerly enslaved people
to give her voice, even going so far
as inferring her mindset and emotions
in certain passages. In one
pivotal scene, for example, when
Sims prepares to perform his
first experiment on her, Hallman
writes, “Anarcha could see that he
was earnest.” However, in his attempt
to infuse narrative flow into
his story, Hallman has committed
a serious crime of historical misrepresentation,
turning Anarcha
into a puppet ventriloquizing the
experiences of a generation of enslaved
people born after her. Throughout the
book, Hallman takes details about Anarcha’s 
life and uses them as prompts to embellish 
his novelistic predilections. 

A monument in Montgomery, Alabama, recognizes
Anarcha and two other experimental subjects.

The book also makes many avoidable
and insensitive errors. In describing
Anarcha’s childhood, for example, Hallman
writes, “there was little distinction between
the white and black children on the
Westcott plantation,” failing to acknowledge 
that, at any point, Anarcha could have been 
sold away from her mother and father, or 
her parents sold away from her. He further 
adds, “some of the children did not real- 
ize they were slaves. But Anarcha always 
knew it and she knew it best from the food.” 
Hallman chose an account by two formerly 
enslaved individuals in Texas in the 1930s 
to assert this sentiment. His reasoning for 
doing so is unclear. 

In historical records, Sims referred to 
Anarcha and his other patients as “the 
cursed,” a phrase Hallman chooses to use
as well, despite ostensibly seeking to recog- 
nize the humanity of enslaved people. The 
most insensitive passages, however, appear 
when Hallman describes the physician’s 
first encounter with a fistula. “Sims felt the 
thrill of a mariner at the first glimpse of an 
undiscovered land,” he writes—a prurient 
description consistent with other later pas- 
sages (“Anarcha climbed onto the table... 
and Sims slid the cold tongue of his specu- 
lum inside of her”). 

The dearth of evidence to adequately 
chart the history of an enslaved person’s 
life represents a frustrating challenge that 
every scholar who studies slavery must 
face. Within African American studies, 
scholars have devised various theories, 
methods, and approaches to address these 
silences, recognizing that the violence of 
slavery extended beyond brutal bodily as- 
saults and produced epistemic violence 
that intentionally eradicated any trace of 
Black people’s subjecthood that did not 
serve as a function of their commodifica- 
tion or production. The goal of medical his- 
torians working on this subject, as of late, 
has been to read the surviving evidence for 
clues that could provide a semblance of in- 
formation about the lives of enslaved Black 
patients. The prospect of writing a useful 
and historically informed book on Anarcha 
is thus not impossible, but it requires a 
more careful approach than Hallman takes, 
one that situates the extant evidence about 
her life in the context of what slavery histo- 
rians and historians of medicine have writ- 
ten about this subject.
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Simulating the Universe 


An accessible introduction to computational cosmology 
offers fodder for profound reflection 


By Tom Abel 


We practitioners of computational
cosmology sometimes forget the
absurd grandiosity of our ambition.
We endeavor to understand
the interplay of the quantum mechanical
processes that set the
initial conditions of the Universe, as well as 
the particle physics, symmetries, and nuclear 
physics that shaped its contents and its early 
history. How are stars able to form? What 
about black holes? How do they assemble 
into galaxies? Atomic and 
molecular physics, radiative
processes, plasma
physics, magnetic fields, 
cosmic rays, and more add 
character and flavor to our 
understanding of the cos- 
mos. Computational sci- 
ence and engineering are 
essential too. In The Uni- 
verse in a Box, cosmologist 
Andrew Pontzen success- 
fully undertakes an impres- 
sive expedition into two 
versions of the cosmos— 
the one observed, and the 
one created and studied 
on  supercomputers—ren- 
dering the complexities of 
astrophysical phenomena 
and cosmological simula- 
tions both comprehensible 
and captivating. 

Pontzen draws parallels
between cosmological
simulations and weather forecasts, paving 
the way for nonexpert readers to comfort- 
ably delve into the complexities of cosmic 
simulations. By extrapolating the daily rel- 
evance of meteorological predictions to the 
grandeur of universe simulations, he beauti- 
fully humanizes the abstract, immersing his 
audience in a narrative as intriguing as it is 
profound. 

The book provides a wonderful account 
of Erik Holmberg’s groundbreaking 1941 ex- 
periment in which he ingeniously employed 
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lightbulbs and photodetectors to simulate 
the collision of two spiral galaxies. It also 
pays a deserving tribute to trailblazers often 
overlooked in the annals of astrophysical his- 
tory. His recounting of Beatrice Tinsley’s and 
Vera Rubin’s substantial contributions to our 
understanding of the Universe is insight- 
ful and respectful. By acknowledging these 
and many more pioneering women cos- 
mologists, Pontzen not only adds a layer of 
richness to his narrative but also boldly ad- 
dresses the historic gender imbalance in the 
field and the societal obstacles overcome by
contributors from underrepresented groups.

Simulations like this one of stars in the early Universe can complement astronomical observations.

As the narrative unfolds, Pontzen draws 
readers into a world of existential questions 
and endless possibilities. He provides a 
clear explanation of the surprisingly popu- 
lar theory that we may inhabit a simulated 
reality, sharing his balanced perspective on 
this hypothesis. Instead of hasty dismissal 
or blind endorsement, he offers a thought- 
ful argument that encourages readers to 
contemplate the Universe’s complexity and 
our place within it, all without straying 
into the realm of science fiction. While The 
Universe in a Box does delve into complex 
themes, Pontzen’s clear prose and illustra- 
tive examples ensure the book’s accessibility 
to a wide audience. 


The Universe in a Box: 
Simulations and the Quest 
to Code the Cosmos 
Andrew Pontzen 

Riverhead Books, 2023. 

272 pp. 


The book also offers timely commentary 
on how future discoveries in astrophys- 
ics and cosmology will likely be aided by 
advancements in artificial intelligence 
and machine learning. After a historical
recounting of Joshua Lederberg and
Edward Feigenbaum’s work on DENDRAL 
(Dendritic Algorithm)—the legend- 
ary chemical analysis program from the 
1960s—it gives a thoughtful description of 
the nuances of Bayesian statistics, includ- 
ing examples from the author’s own origi- 
nal work. Here, Pontzen shares valuable 
insights that will prove 
useful to anyone trying to 
keep up with the explo- 
sion of machine learning 
techniques in modern as- 
tronomical data analysis 
and interpretation. 

Despite its strengths, 
the book leaves some 
stones unturned. Expert 
readers may feel that sev- 
eral historical connections 
and broader contexts are 
underexplored, including 
the military-industrial 
complex’s role in early 
high-performance com- 
puter development that 
was crucial for the devel- 
opment of tools now used 
in the field of computa- 
tional cosmology. A very 
large opportunity missed, 
however, is the book’s 
complete lack of illustra- 
tions and visualizations, without which the 
reader cannot easily appreciate how well 
simulations can sometimes reproduce ob- 
servations of the actual Universe. 

Overall, however, The Universe in a 
Box is a compelling exploration of scien- 
tific discovery, historical context, and the 
philosophical questions prompted by the 
creation of virtual universes. It is a brisk 
read, a heartfelt recounting of our ongoing 
efforts to uncover the Universe’s secrets, 
and a veritable treasure chest filled with 
captivating stories seldom shared with 
nonexperts that offers a profound reflec- 
tion on our human quest to understand 
the cosmos. 
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Invasive species, such as goldenrod (Solidago sp.), have contributed to biodiversity loss on abandoned land in central Europe. 


Edited by Jennifer Sills 


Abandoned land: Linked 
to biological invasions 


In their Perspective “Abandoning land
transforms biodiversity” (12 May, p. 581),
G. N. Daskalova and J. Kamp argue
that abandoned land offers benefits for
nature conservation. However, research
from Eastern and Central Europe (1-5)
reveals that land abandonment can lead
to an influx of invasive species. Although
Daskalova and Kamp acknowledge in
their figure that invasive species could
influence conservation success, they
do not sufficiently emphasize the risks.
Agricultural land, which is less vulnerable 
to invasions than abandoned land, may 
offer more conservation benefits. 

Central Europe experienced vast land 
abandonment after the collapse of communism
in the 1990s. For example, 12% of
Poland’s agricultural land from that era 
is now abandoned (6). Up to 75% (3) of 
that abandoned land is now dominated 
by invasive plant species such as golden- 
rod (Solidago sp.) (2, 3), walnut (Juglans 
regia) (5), and boxelder maple (Acer 
negundo) (5). Goldenrod induces biodiver- 
sity decline and hinders the natural pro- 
cess by which the structure of a biological 
plant community evolves from grassland 
into forest over time (7). As a result of 
goldenrod growth, wild pollinator abun- 
dance in abandoned land has decreased 
by 60 to 70% (1-3), and bird abundance
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has decreased by 50% (2, 3). Furthermore, 
invasive species spill over from aban- 
doned land into agricultural land, where 
invasions decrease crop yields (7-9). 
Agricultural land management can be 
less harmful to biodiversity than aban- 
doned land when considering the impact 
of invasive species (1-4, 7-9). Agricultural 
cultivation can limit invasive species 
spread and prevent plant invasions (2, 
3, 5). Extensively managed agricultural 
land, such as grasslands and pastures, has 
higher biodiversity than invaded aban- 
doned lands (2-6). Implementing a land- 
sharing policy that minimizes abandoned 
lands and instead maintains low-intensity
agricultural lands could prevent inva- 
sions, benefit biodiversity, and sustain 
an economy that can support people’s 
livelihoods. 
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Abandoned land: 
Overestimated potential 


In their Perspective “Abandoning land 
transforms biodiversity” (12 May, p. 581), 
G. N. Daskalova and J. Kamp suggest
that rural depopulation, environmental
degradation, and other factors have led to 
vast areas of land globally that are avail- 
able for restoration. We agree that careful 
planning and management are needed
to conserve biodiversity on lands where
human activities have ceased. However, 
by oversimplifying the socioeconomic 
processes that affect whether land is truly 
abandoned, the authors overestimate the 
potential of these lands for biodiversity 
conservation. 

Landholder decisions, mediated by 
property rights that are often complex, 
drive land cover changes (1-3). These
choices affect the cessation and frequent 
recurrence of land use. In many loca- 
tions around the world, land considered 
to be “abandoned” is more likely to be 
subsequently used by humans than
allowed to regenerate to a more natural 
ecosystem (4-6). 

Land abandonment and reversals 
result from complex socioeconomic 
dynamics including changes in agricul- 
tural prices and technologies, off-farm 
employment opportunities, political 
instability, and government policies (2, 
3, 7). Although forest transition theory 
assumes that people abandon land when 
migrating to cities, those lands are often 
used for other purposes (7), and there are 
many examples of reverse migration to 
rural areas when economic opportunities 
or personal situations change (3, 8). In 
cases where landholders are forced from 
their land due to land grabs or war, land 
is certainly not abandoned. 

Successfully planning land uses to 
simultaneously support biodiversity 
conservation and human livelihoods 
must start by understanding the social 
systems and processes that lead to land- 
use change—a task that requires on-the- 
ground knowledge and is not well-suited 
to remote sensing. Without considering 
landholding status and associated liveli- 
hoods, rewilding projects on “abandoned” 
land are unlikely to succeed. 

Response 


The views of Lenda et al. and Holl et al. 
about land abandonment, biodiversity, and 
people largely align with ours. To advance 
our understanding of the short- and long- 
term impacts of abandonment, biodiversity 
research and policy should explicitly define 
spatial, temporal, and taxonomic scales; dis- 
tinguish between passive and active manage- 
ment of abandoned land; and acknowledge 
differences in landscape heterogeneity and 
socioeconomic context. 

Global change drivers such as land aban- 
donment have heterogeneous biodiversity 
impacts (1, 2). In our Perspective, we discuss 
both positive and negative effects of aban- 
donment, including the spread of invasive 
species on abandoned land to which Lenda 
et al. refer. Given that abandoned agricul- 
ture has mostly been overlooked in global 
assessments (3), there is limited evidence 
that abandonment always triggers suc- 
cessful invasions. Thus, we disagree with 
Holl et al. that we have overestimated the 
potential of abandoned land for conserva- 
tion, as we do not yet know the full scope 
of opportunities and threats that abandon- 
ment creates for biodiversity. Although land 
abandonment is often ephemeral (4, 5), its 
duration can vary substantially. Short-term 
abandonment can still benefit biodiversity 
(6); a decade would, for example, capture 
several generations of insects and birds. 
Abandonment of 30 to 49 years, which 
occurs commonly across Eurasia, also has 
positive biodiversity effects (5). Invasive 
species can indeed become dominant on 
abandoned land as outlined by Lenda et al., 
but not all abandoned land loses its eco- 
logical and ecosystem services value when 
invasives are present (7). 

Our Perspective looks globally at the 
many possible trajectories for biodiver- 
sity after abandonment, which can occur 
without human interference (passive resto- 
ration) and with active involvement, such as 
reintroduction schemes (8). Because not all 
land has a known owner or is actively man- 
aged, we do not consider land ownership 
and abandonment to be mutually exclusive. 
Assessing biodiversity change after aban- 
donment should include a determination 
of whether land is under passive or active 
management, the abandonment duration, 
and the likelihood of recultivation. 

We disagree with Lenda et al. that farm- 
land is more resistant to invasions than 
abandoned land. Low-intensity agricultural 
land, often a refuge for declining biodiver- 
sity, has more invasive species compared 
with intensively used land (3). Human trade 
and transport are strong predictors of invasive 
species spread and establishment (9, 10), 
and both decrease where abandonment 
is paralleled with depopulation. Current 
land abandonment hotspots, such as the 
former Soviet Union, host comparatively 
few invasive species due to low human 
population density and less ecosystem disturbance 
after rural out-migration (11). 
Abandonment occurs in many social 
and cultural contexts (12), which must be 
considered to achieve maximum benefits 
of abandoned land for people and nature. 
In places such as Kazakhstan, vast areas of 
abandoned land are state-owned and parts 
of them have already been designated as 
protected areas (13). In contrast, abandoned 
land in Bulgaria is predominately privately 
owned, with one plot often having upwards 
of 20 heirs, hindering unified land plan- 
ning. Given the many unknowns surround- 
ing the current and future relationships 
between humans and nature in transitional 
rural landscapes, it is important to look 
for more than one type of solution (14), 
depending on scale, context, and people’s 
evolving needs. 


Gergana N. Daskalova1* and Johannes Kamp2 
1Biodiversity, Ecology, and Conservation Group, 
International Institute for Applied Systems 
Analysis, 2361 Laxenburg, Austria. 2Department 
of Conservation Biology, University of Göttingen, 
37073 Göttingen, Germany. 

*Corresponding author. 
Email: gndaskalova@gmail.com 


REFERENCES AND NOTES 


1. C. Queiroz, R. Beilin, C. Folke, R. Lindborg, Front. Ecol. 
Environ. 12, 288 (2014). 
2. IPBES, “Global assessment report on biodiversity 
and ecosystem services of the Intergovernmental 
Science-Policy Platform on Biodiversity and Ecosystem 
Services,” E. S. Brondizio, J. Settele, S. Diaz, H. T. Ngo, 
Eds. (Bonn, Germany, 2019). 
3. Liu et al., Nat. Commun. 14, 2090 (2023). 
4. C. L. Crawford, H. Yin, V. C. Radeloff, D. S. Wilcove, Sci. 
Adv. 8, eabm8999 (2022). 
5. T. Plieninger, C. Hui, M. Gaertner, L. Huntsinger, PLOS 
ONE 9, e98355 (2014). 
6. J. Kamp, R. Urazaliev, P. F. Donald, N. Hölzel, Biol. 
Conserv. 144, 2607 (2011). 
7. L. Pejchar, H. Mooney, in Bioinvasions and Globalization, 
C. Perrings, H. Mooney, M. Williamson, Eds. (Oxford 
University Press, Oxford, ed. 1, 2009), pp. 161-182. 
8. A. Perino et al., Science 364, eaav5570 (2019). 
9. H. Seebens et al., Proc. Natl. Acad. Sci. U.S.A. 115, E2264 
(2018). 
10. P. Pyšek et al., Proc. Natl. Acad. Sci. U.S.A. 107, 12157 
(2010). 
11. A. J. Turbelin, B. D. Malamud, R. A. Francis, Glob. Ecol. 
Biogeogr. 26, 78 (2017). 
12. C. Quintas-Soriano, A. Buerkert, T. Plieninger, Land Use 
Pol. 116, 106053 (2022). 
13. J. Kamp et al., J. Appl. Ecol. 52, 1578 (2015). 
14. J. Fischer, T. Hartel, T. Kuemmerle, Conserv. Lett. 5, 167 
(2012). 




10.1126/science.adj1595 


ERRATA 
Erratum for the Research Article “Garnet 
crystallization does not drive oxidation at arcs” 
by M. Holycross and E. Cottrell, Science 381, 
eadj6418 (2023). 

Published online 14 July 2023 

10.1126/science.adj6418 


science.org SCIENCE 


Edited by Michael Funk 


ICE SHEETS 


Greenland unfrozen 


Measurements made on subglacial sediment from the Camp 
Century ice core in northwestern Greenland show that the 
location was ice free during the interglacial 
that occurred around 400,000 years ago. 
Christ et al. used luminescence dating and 
cosmogenic nuclide data to show that the 
sediment was deposited under ice-free conditions 
after having been exposed at the surface 
to sunlight fewer than 16,000 years earlier. 
The absence of ice at that location means that 
the Greenland Ice Sheet must have contributed 
more than 1.4 meters of sea-level equivalent to 
the high sea-level stand, when the average global 
air temperature was similar to what we will soon 
experience because of human-caused climate 
warming. —HJS Science, ade4248, this issue p. 330 


Sediment cores from northwest Greenland 
reveal a deglaciation 400,000 years ago that would 
have contributed substantially to global sea-level rise. 


Mechanisms of 
hexasome remodeling 

The packaging of DNA by 
histone proteins into nucleosomes 
regulates how genomic 
information is expressed and 
maintained in the nucleus of 
a cell. Hexasomes are noncanonical 
nucleosomes with 
six instead of eight histones. 
Although it is known that 
hexasomes are linked to 
actively transcribed genes, 
it has been unclear how the 
cellular machinery functions 
in the context of hexasomes. 
Zhang et al. report structural 
and mechanistic findings on 
the remodeler INO80, which 
prefers hexasomes over 
nucleosomes, that explain 
how this enzyme specifically 
recognizes and remodels 
hexasomes. The authors 
suggest that hexasomes not 
only alter the packaging of 
DNA but may also change 
how enzymes and other factors 
interpret the regulatory 
information of chromatin. In 
a complementary study, Wu 
et al. determined a structure 
showing how INO80’s 
remodeling subunit binds at 
a location called SHL-2 on 
a hexasome, rotated 180° 
compared with its position on 
a nucleosome. Both hexasome 
and nucleosome sliding by 
INO80 requires action from 
SHL-2, suggesting additional 
steps that slow nucleosome 
sliding. INO80's highly regulated 
mechanism explains its 
functional versatility. —DJ 
Science, adf6287, adf4197, 
this issue p. 313, p. 319 


A tunable moiré magnet 
Twisted transition metal dichal- 
cogenide bilayers have been 
predicted to exhibit exotic prop- 
erties. Anderson et al. stacked 
two molybdenum ditelluride 
layers on top of each other in 

a rhombohedral configuration 
and with a twist angle of about 
4° resulting in a moiré structure. 
By varying the carrier density 


and an applied electric field, 
the researchers were able to 
tune the geometry of the lat- 
tice and the nature of magnetic 
interactions. This tunability is 
expected to enable the realiza- 
tion of a host of correlated 
states in this system. —JS 
Science, adg4268, this issue p. 325 


Side-stepping HF 

Fluorine occurs naturally as a 
calcium salt in the highly insol- 
uble mineral fluorspar. For well 
over 200 years, the first step in 
accessing fluorinating reagents 
has been the conversion of 
fluorspar to perilously toxic 
and corrosive hydrofluoric acid 
(HF). Patel et al. now report 
a safer alternative means of 
activating fluorspar for down- 
stream fluorination chemistry. 
Specifically, they enhanced 
its solubility by grinding it in 
a ball mill with dipotassium 
hydrogen phosphate salt. The 
resultant solid can then be 
used to form carbon-fluorine 
and sulfur-fluorine bonds with 
electrophilic reactants in an 
alcohol solvent. —JSY 

Science, adi1557, this issue p. 302 


AUTOIMMUNITY 
Celiac disease details 


Celiac disease is an autoim- 
mune disease for which the 
triggering antigen, ingested 
gluten, is well defined, thus 
facilitating investigation of 
immunological changes that 
lead to pathology. Kornberg 
et al. performed multiplexed, 
single-cell analysis of intes- 
tinal and peripheral blood T 
cells from patients in dif- 
ferent disease states and 
from healthy controls. They 
identified distinct immune cell 
signatures, including elevated 
CD4+ follicular T helper cells, 
regulatory T cells, and natural 
CD8+ αβ and γδ intraepithelial 
T cells (T-IELs), in patients 
with untreated celiac disease. 
The celiac disease-associated 
T cell receptor repertoire 
observed in natural killer 
receptor-expressing natural 
CD8+ αβ and γδ T-IELs showed 
evidence of antigen-mediated 
selection. In response to 
gluten ingestion, a natural 
killer receptor-expressing 
memory T-IEL subset gives rise 
to cytotoxic cells that appear 
to mediate celiac disease- 
associated intestinal damage. 
—CNF 
Sci. Immunol. (2023) 
10.1126/sciimmunol.adf4312 


CANCER 
A sugar-free 


cancer therapy 


Glioblastoma (GBM) is an 
aggressive, uncurable primary 
brain tumor. Immunotherapies 
hold promise as a treatment, 
but GBM tumors evolve strate- 
gies to evade the immune 
system. One such strategy the 
tumors use is to hypersialylate 
the surfaces of their cells, 
which simultaneously masks 
them from the immune system 
and tolerizes tumor-associated 
macrophages and microglia 
through sialic acid-binding 
immunoglobulin-like lectins 
(Siglecs). Schmassmann et 
al. found that inhibiting the 
Siglec-sialic acid axis either 
genetically or with an antibody 
delayed tumor growth 
in mouse models of GBM 
and could be combined with 
immune checkpoint inhibitors. 
Siglec-sialic axis disruption 
further led to immune activa- 
tion and tumor cell death in 
patient-derived GBM explants. 
These encouraging results 
support further study of 
the Siglec-sialic axis in the 
context of human GBM and its 
potential clinical translation. 
—CSM 


Sci. Transl. Med. (2023) 
10.1126/scitranslmed.adf5302 


CORONAVIRUS 
Anatomy of an epidemic 


The severe acute respiratory 
syndrome coronavirus 2 pan- 
demic changed character with 
the emergence of the Omicron 
lineage in South Africa in 
2021. This lineage showed 
elevated transmissibility and 
increased immune evasion. 
Tsui et al. traced the history of 
the introduction of this variant 
in the United Kingdom, where 
exceptionally comprehensive 
genetic sampling regimes 
were established. The authors 
found that the virus had been 
introduced undetected into 
England between 5 and 18 
November 2021, and South 
African scientists alerted the 
World Health Organization on 
22 November 2021. However, 
by the time the UK government 
had responded, the variant had 
already spread between UK 
cities and globally. Therefore, 
the subsequent travel restric- 
tions placed on southern 
Africa were futile. —CA 

Science, adg6605, this issue p. 336 


ECONOMICS 


The price of housing discrimination 


Constraints on housing choices caused by racial discrimination 
in the US rental market impose damages 
upon Black and Hispanic renters equivalent to about 3 
to 5% of their annual income. Christensen and Timmins 
deployed a bot to scour online apartment listings and 
send 18,000 inquiries about 6000 apartments in five major 
metro regions, varying the names of the “apartment seeker” 
to suggest different racial identities. Non-white identities 
had about an 8% lower likelihood of receiving a response 
that the apartment was available, particularly for neighbor- 
hoods with more white residents, higher rental demand, 
stronger schools, more cafés, fewer murders, and less toxic 


air quality. —BW 


Q.J. Econ. (2023) 10.1093/qje/qjad029 


HUMAN GENETICS 
These bones were made 
for walking 


Many skeletal changes occurred 
on the path to modern humans, 
resulting in bipedalism but also 
susceptibility to musculoskeletal 
diseases. Kun et al. used imaging 
data from more than 30,000 UK 
Biobank participants to char- 
acterize skeletal proportions, 
assessing the genetic basis of 
these features, as well as their 
relationships to each other. They 
found that limb proportions are 
uncorrelated with body width 
proportions, that there are 
associations between hip- and 
leg-related skeletal proportions 
and osteoarthritis, and that there 
is enrichment for loci associ- 
ated with skeletal proportion 
in genomic regions associated 
with human-specific evolution. 
This study demonstrates the 
utility of using imaging data from 
biobanks to understand both dis- 
ease-related and normal physical 
variation among humans. —CNS 
Science, adf8009, this issue p. 283 


MICROBIOME 

The yin and yang of 
peptide responses 

To prevent gut microbiota from 
running amok, animals and plants 
secrete a series of small, often 
multifunctional peptides called 
antimicrobial peptides. Until 
recently, antimicrobial peptides 
were considered to have broad 
activities, and it was unclear why 
such molecules showed signs 

of rapid evolution. Hanson et al. 
found a striking specificity for the 
peptides diptericin A and B for 
two species of gut commensal 
bacteria. These species occur 

in the natural environment of 
fruit flies depending on the food 
resource exploited: fruit or fungi. 
Thus, the presence or absence 
of diptericin A or B predicts the 
ecology of the fly. This work 
shows how an organism's micro- 
biota might be able to shape the 
host’s immune responses in a 
manner similar to how a host's 
immune responses shape its 
microbiota. —CA 

Science, adg5725, this issue p. 284 


HEART DISEASE 
Sleep tight, don’t let the 
immune cells bite 


Patients with heart disease 
frequently present with low 
melatonin levels and show 
disruptions in their sleep—wake 
cycles. Although disordered sleep 
adds considerably to the overall 
disease burden of these patients, 
the mechanisms that under- 
lie this phenomenon remain 
unclear. Ziegler et al. report in 
both mice and humans that sleep 
disruption in cardiac disease is 
driven by the loss of neurons 
that normally project from the 
superior cervical ganglia into 
the pineal gland, which secretes 
melatonin (see the Perspective 
by Davis and Attwell). They found 
that heart disease triggers the 
infiltration of macrophages into 
superior cervical ganglia, where 
they orchestrate neuronal cell 
death. Depletion of macrophages 
or inhibition of their activation 
attenuated these defects in a 
mouse model of heart disease, 
suggesting an actionable target 
for future therapies. — STS 
Science, abn6366, this issue p. 285; 
see also adj0217, p.270 


ENERGY HARVESTING 
A solar boost for 
thermogalvanics 


In thermogalvanic cells, tem- 
perature-driven concentration 
gradients of redox-active species 
create a potential difference that 
can produce electricity. Wang 
et al. show that for the redox 
coupling between Fe(CN)₆⁴⁻ and 
Fe(CN)₆³⁻ in a gel matrix, water- 
splitting photocatalysts that 
generated oxygen and hydrogen 
boosted the concentration gradi- 
ents of the redox ions and added 
a proton gradient to the cell (see 
the Perspective by Yu and Duan). 


The cell had a thermopower of 
82 millivolts per degree kelvin 
and also generated a hydrogen 
by-product. —PDS 
Science, adg0164, this issue p. 291; 
see also adi8036, p. 269 


BIOMATERIALS 
Gentle, precision 
interfacing with the brain 


Interfacing of neural devices 
with the brain enables detailed 
recording and stimulation, but 
there is typically a trade-off 
between the level of invasiveness 
and the resolution of the device. 
Zhang et al. developed a probe 
consisting of an ultrasmall and 
flexible electronic mesh loaded 
onto a flexible microcatheter 
(see the Perspective by Timko). 
Because of its size and flexibility, 
this probe can be implanted into 
100-micrometer-scale blood 
vessels in the brain without 
requiring open-skull surgery 
and without damaging the brain 
or vasculature. The authors 
demonstrated the potential of 
their device by measuring local 
field potentials and single-unit 
spikes in the cortex and olfactory 
bulb of a rat. The meshes 
also demonstrated long-term 
stability with minimal immune 
response. —MSL 

Science, adh3916 this issue p. 306; 

see also adi9330, p. 268 


MICROBIOTA 
How to get on together 


The question of how organisms 
coexist in communities is a 
pivotal issue for ecologists. To 
answer this question in a natu- 
ral ecosystem would require 
the imposing task of isolating 
and competing all coexisting 
members. To render the ques- 
tion experimentally tractable, 
Chang et al. isolated organisms 
from stable synthetic bacterial 
communities and competed all 
possible combinations of pairs 
of organisms to test their ability 
to live together. Competitive 
exclusion occurred in a majority 
of pairs, whereas a minority 
of pairs coexisted. Therefore, 
species coexistence is in 
part the result of networks of 
interactions and is an emergent 
property of community assembly, 
pointing to the importance 
of sustaining biodiversity. —CA 
Science, adg0727, this issue p. 343 


HOST AND MICROBE 
H. pylori perturbs planar 
cell polarity 


The strongest risk factor for 
gastric cancer is infection with 
Helicobacter pylori strains that 
inject the toxin CagA into gastric 
epithelial cells. Takahashi- 
Kanemitsu et al. report that 
CagA may promote gastric 
cancer by interfering with Wnt- 
mediated planar cell polarity 
(Wnt/PCP) signaling. CagA 
disrupted Wnt/PCP-dependent 
morphogenetic processes in 
frog embryos and stimulated 
the proliferation of pyloric gland 
stem cells in the mouse stom- 
ach. It also interfered with this 
signaling pathway by causing 
the mislocalization of VANGL, 
a core component of the Wnt/ 
PCP signaling complex. —AMV 
Sci. Signal. (2023) 
10.1126/scisignal.abp9020 


GEOPHYSICS 
A predictable rupture 


Unlike some volcanic erup- 
tions, no clear set of precursor 
signals have been identified for 
large earthquakes. Bletery and 
Nocquet analyzed high-rate GPS 
time series before 90 different 
earthquakes that were magnitude 
7 and above to find a precursor 
signal (see the Perspective by 
Burgmann). They observed a 
subtle signal that rose from the 
noise about 2 hours before these 
major earthquakes occurred. This 
work may allow fault monitoring 
for this precursor phase with 
denser and higher-precision 
instrumentation. —BG 

Science, adg2565 this issue p. 297; 

see also adi8032, p. 266 




ILLUSTRATION: MICHAEL GLENWOOD GIBBS/ISPOT STOCK 


FORESTRY 
Sharing the forest 
for the trees 


Formalized rights to land can help 
relieve a range of problems, from 
land grabs and tenuous access 

to natural resources to conflicts 


between resource users. In recent 
decades, indigenous groups 

and rural communities have 
pressured many national govern- 
ments to enact land reform, 
granting collective property 
rights to restore communities’ 
historic claims. Kaur et al. found 
that collective rights encourage 
collective behaviors, helping to 
explain reports that shared prop- 
erty rights improve economic and 
environmental outcomes. Cross- 
sectional data from 10 countries 
showed that forest user groups 
with collective land rights had 
greater interpersonal interaction 
and cooperation. In a common- 
pool resource game with forest 
users in Uganda and Bolivia, 
groups with collective land rights 
had greater communal trust and 
a more equitable harvest. —BEL 
Conserv. Lett. (2023) 
10.1111/conl.12950 


IMMUNOLOGY 
Executing food tolerance 


Animals need to induce immune 
responses against pathogens 
while retaining immune toler- 
ance toward commensals and 
food in the gut, but how this 
process is regulated was unclear. 
He et al. studied one regula- 

tory candidate, the pyroptosis 
executioner protein gasdermin D 
(GSDMD), in intestinal epithe- 
lial cells. Removing GSDMD 
disrupted immune tolerance 

to food in the small intestine. A 
30-kilodalton GSDMD cleavage 
fragment executed pyroptosis by 
translocating from the cytosol 
and rupturing cell membranes. 




Genetic screening of 
newborns with a simple 
heel-prick test can also 

be helpful for healthy 
children and their families. 


However, in intestinal epithelial 
cells, the authors found a dif- 
ferent, 13-kilodalton N-terminal 
GSDMD cleavage fragment 
translocated to the nucleus, 
where it induced the transcrip- 
tion of major histocompatibility 
complex class II (MHCII) molecules. 
In turn, this process 
induced type 1 regulatory T cells 
in the upper small intestine that 
endow food tolerance. Impairing 
these processes in mice damaged 
MHCII expression and led 
to food intolerance. Therefore, in 
the small intestine, the differen- 
tial cleavage of GSDMD controls 
the balance between immune 
responsiveness and tolerance. 
—SMH 
Cell (2023) 
10.1016/j.cell.2023.05.027 


BORON CHEMISTRY 
Decoding boron 
monoxide’s structure 

The synthesis of boron monoxide 
was reported in the mid-1950s, 
but its hard, brittle nature has 
made experimental structural 
studies difficult. Of the many 
theoretically proposed structures, 
the ones based on boroxine 
(B₃O₃) rings have been favored. 
Perras et al. report a solid-state 
boron-11 nuclear magnetic 
resonance study in which 
three-spin triple-quantum single- 
quantum correlation experiments 
revealed the presence of O₂B-BO₂ 
units that formed larger 
B₄O₂ rings versus boroxine rings. 
Powder x-ray diffraction showed 
that these structures contained 
two-dimensional layers that stack 
randomly. —PDS 
J. Am. Chem. Soc. (2023) 
10.1021/jacs.3c02070 


QUANTUM CHEMISTRY 
A joint approach to 
solvation modeling 

Rigorous atomic-scale simulations 
of solvation, the interaction 
of a solvent with a dissolved 
solute, remain one of the key 
challenges in computational 
chemistry and physics, and 
are complicated by a nontrivial 
hydrogen-bonding network in the 
case of water or similar solvents. 
Using rare-event sampling within 
density functional theory and 
molecular dynamics combined 
with density-based embedded 
correlated wavefunction theory, 
Martirez and Carter developed a 
multilevel dynamic scheme that 
achieved excellent agreement 
between theory and experiment 
for the dissolution of carbon dioxide 
in water and its hydration to 
carbonic acid. Such a combined 
approach, which uses both static 
and dynamic quantum mechanical 
simulations, represents 
a promising step toward the 
accurate description of solvation 
dynamics. —YS 
J. Am. Chem. Soc. (2023) 
10.1021/jacs.3c01283 


DISEASE GENETICS 
Baby sequencing helps parents 

Genomic sequencing may uncover information 
that may or may not be actionable. 
Because of this, and because of the costs 
involved, newborns’ genomes are usually 
sequenced only in cases of severe disease 
or if clinicians suspect a genetic disorder. 
However, a study by Green et al. suggests that 
sequencing may be useful even for healthy children. 
The authors performed genomic sequencing 
on 159 infants, 32 of whom were in intensive care 
and the rest apparently healthy newborns. In six 
sick infants and 11 healthy ones, the researchers 
uncovered clinically important genetic conditions, 
some of which yielded actionable clinical 
information, not only for the infants but also for 
their families, showing the potential value of such 
screening. —YN 
Am. J. Hum. Genet. (2023) 10.1016/j.ajhg.2023.05.007 


COGNITION 
A checkpoint to 
remember 


The immune checkpoint 
regulator programmed cell death 
protein 1 (PD-1) is best known 
for its role as a target in cancer 
immunotherapy. In the brain, 
PD-1 is expressed by neuronal 
and non-neuronal populations, 
but its roles remain to be fully 
elucidated. Zhao et al. found 
that depletion of neuronal PD-1 
in the hippocampus increased 
synaptic plasticity and cognitive 
performance in wild-type mice. 
In addition, pharmacological 
PD-1 inhibition restored spatial 
memory after traumatic brain 
injury in mice, indicating that 
inhibiting neuronal PD-1 signaling 
might be effective in promoting 
memory performance in health 
and disease. -MMa 
Neuron (2023) 
10.1016/j.neuron.2023.05.022 


INTRODUCTION: Humans are the only bipedal 
great apes, owing to our distinctive skeletal form. 
Morphological changes that contribute to our 
skeletal form have been studied extensively in 
paleoanthropology. With the exception of stand- 
ing height, examining the genetic basis for dif- 
ferential and specific growth of individual bones 
and their evolution has been challenging be- 
cause of limited sample sizes. 


[Figure panel (A): Skeletal proportions (SPs) computed from DXA scans of 31,221 individuals. Labeled landmarks and measures include torso length, hip width, and height; length measures were analyzed as proportions of overall height.] 
RATIONALE: One approach to studying skeletal 
form is to obtain a map of regions in the ge- 
nome that affect skeletal development and mor- 
phology. Previously, this has been examined 
mainly through animal models and compara- 
tive genomics, but these approaches are largely 
low throughput. A complementary approach 
is to examine the genetic basis of variation in 
skeletal traits in humans. In this work, we ap- 
plied deep-learning models to full-body 
dual-energy x-ray absorptiometry (DXA) images 
from the UK Biobank to extract 23 different 
image-derived phenotypes that include all long- 
bone lengths and hip and shoulder widths, 
which we analyzed while controlling for height. 

[Figure panels: (B) Genomic locations of SP-associated loci. (D) Genes associated with SPs show evidence for accelerated evolution in humans (Hip width:height).] 


RESULTS: All skeletal proportions (SPs) are high- 
ly heritable (~30 to 50%), and genome-wide as- 
sociation studies of these traits identified 145 
independent loci. These loci are enriched in genes 
that regulate skeletal development as well as those 
that are associated with rare human skeletal dis- 
eases and abnormal mouse skeletal phenotypes. 
Genetic correlation and genomic structural equa- 
tion modeling indicated that limb proportions 
exhibited strong genetic sharing but were genet- 
ically independent of width and torso proportions. 
Phenotypic and polygenic risk score analyses 
identified specific associations between osteo- 
arthritis of the hip and knee, which are the lead- 
ing causes of adult disability in the United States, 
and SPs of the corresponding regions. We also 
found genomic evidence of evolutionary change 
in arm-to-leg and hip-width proportions in hu- 
mans, consistent with notable anatomical changes 
in these SPs in the hominin fossil record. In con- 
trast to cardiovascular, autoimmune, metabolic, 
and other categories of traits, loci associated 
with these SPs are significantly enriched both in 
human accelerated regions and in regulatory 
elements of genes that are differentially ex- 
pressed in humans and the great apes through- 
out development. 


CONCLUSION: Our work validates the use of deep- 
learning models on DXA images to identify spe- 
cific genetic variants that affect the human skeletal 
form. It also ties a major evolutionary facet of 
human anatomical change to pathogenesis. 
The genetic basis, evolution, and health con- 
sequences of human skeletal traits. (A) Measure- 
ment of SPs using a deep learning-based landmark 
estimation method on full-body DXAs. (B) Location of 
loci that localize to a single protein-coding gene 
and are associated with various SPs, colored according 
to the scheme in (A). (C) Significant phenotypic and 
genetic associations of various SPs with musculo- 
skeletal disease or joint pain. Number notations in 
parentheses are the ICD-10 (International Classification 
of Diseases, Tenth Revision) codes associated with each 
disease. OA, osteoarthritis; TFA, tibiofemoral angle. 
(D) SPs with genomic evidence of human-specific 
evolution. Illustration was created with 
BioRender.com. 
The human skeletal form underlies bipedalism, but the genetic basis of skeletal proportions (SPs) 
is not well characterized. We applied deep-learning models to 31,221 x-rays from the UK Biobank 
to extract a comprehensive set of SPs, which were associated with 145 independent loci 
genome-wide. Structural equation modeling suggested that limb proportions exhibited strong genetic 
sharing but were independent of width and torso proportions. Polygenic score analysis identified 
specific associations between osteoarthritis and hip and knee SPs. In contrast to other traits, 
SP loci were enriched in human accelerated regions and in regulatory elements of genes that are 
differentially expressed between humans and great apes. Combined, our work identifies specific 
genetic variants that affect the skeletal form and ties a major evolutionary facet of human 
anatomical change to pathogenesis. 


Humans are the only primates who are 
normally bipedal, owing to our distinc- 
tive skeletal form, which stabilizes the 
upright position. Bipedalism is enabled 
by specific anatomical properties of the 
human skeleton, including shorter arms rela- 
tive to legs, a narrow body and pelvis, and the 
orientation of the vertebral column (1-3). These 
broad changes to skeletal proportions (SPs) 
likely began to occur around the separation 
of the human and chimpanzee lineages, and 
as a result, may have facilitated the use of tools 
and accelerated cognitive development (4, 5). 
Fossil evidence showing major morphological 
changes in the length of the limbs, torso, and 
body width suggests that these changes were 
gradual, with incremental development over 
the course of several million years (6, 7). However, 
despite more than a hundred years of effort in 
paleoanthropology documenting morphological 
changes of the skeletal form in human evolution, 
evidence of genomic change has been elusive. 
In developmental biology, the mechanisms 
and processes underlying animal limb develop- 
ment, morphology, and broad body plan have 
been studied extensively. Early work using 
forward genetic screens in Drosophila iden- 
tified homeobox genes as key regulators of 
anatomical development in invertebrates (8). 
Subsequent experiments in vertebrates, includ- 
ing fish, chickens, and mice, identified addition- 
al gene families that are crucial in the regulation 
of skeletal development and form (9, 10). 
Comparative genomic and evolutionary devel- 
opmental biology approaches have produced 
several insights into the genetic basis of skel- 
etal structure, from the underpinnings of conver- 
gent limb loss in snakes and limbless lizards 
(11, 12) to increased limb lengths in jerboas 
when compared with mice (13). However, these 
approaches do not provide an unbiased and 
comprehensive map of the genetic loci that 
regulate SPs and overall body plan. In addition, 
many of these approaches largely focus on 
examining the impact of loss-of-function muta- 
tions, which often have widespread effects on 
the entire skeleton. The subset of genes re- 
sponsible for differential and specific growth 
of individual bones remains unknown. 
Genome-wide association studies (GWASs) 
of human skeletal traits are a direct and comple- 
mentary approach to characterizing the genetic 
basis of traits. Twin studies suggest that the 
heritability of SPs ranges between 0.40 and 0.80 
(14), similar to the heritability of standing 
height (15), a skeletal trait that has served as an 
exemplary quantitative trait in human genetics. 
Meta-analysis of more than 5 million individu- 
als has identified a saturated map of common 
genetic variants that are associated with stand- 
ing height (16). However, height is among the 
most straightforward and accurate of quantita- 
tive traits to measure. Other skeletal elements, 
such as limb, torso, and shoulder lengths, are 


not typically or comprehensively measured in 
large sample sizes (17, 18). As a result, the 
genetic basis of such proportions and lengths 
remains understudied. Furthermore, anthropo- 
metric traits, like hip and waist circumferences, 
are measured externally and therefore are in- 
trinsically tied to body-fat percentage and dis- 
tribution, which fails to isolate genetic effects 
specific to the skeletal frame (19, 20). 

Applying deep-learning methods to non- 
invasive medical imaging is a powerful way 
to extract skeletal measures in an accurate and 
scalable manner. Furthermore, the collection 
of genetic, phenotypic, and imaging data by 
national biobanks provides an opportunity to 
run GWASs for image-derived phenotypes (IDPs) 
with sufficiently large sample sizes. Several ge- 
netic studies have successfully applied com- 
puter vision to generate IDPs of the retina, 
distribution of body fat, heart structure, and 
liver-fat percentage and have linked significant 
loci to various disorders (21-24). 

In the context of musculoskeletal disease, 
epidemiological data suggest that disorders 
such as osteoarthritis (OA), the leading cause of 
adult disability in the United States (25, 26), 
are thought to be influenced by a variety of risk 
factors that range across obesity, mechanical 
stresses, genetic factors, and even the geometric 
structure of certain bones (27). Although some 
small studies have examined the relationship 
of certain skeletal element lengths such as 
leg-length discrepancy and OA (28), how the 
skeletal frame may exacerbate an individual's 
development of osteoarthritic disease has not 
been fully studied (27, 29). 

In this study, we applied methods in com- 
puter vision to derive comprehensive human 
skeletal measurements from full-body dual- 
energy x-ray absorptiometry (DXA) images at 
biobank scale. We then performed genome- 
wide scans on 23 generated phenotypes to 
identify loci associated with variation in the 
skeletal form. Using summary statistics from 
these IDPs, we identified biological processes 
linked with human SPs and studied the pheno- 
typic and genetic correlation between these 
measures and a range of external phenotypes, 
with an emphasis on musculoskeletal dis- 
orders. Finally, we investigated the impact of 
natural selection on these traits to understand 
how skeletal morphology is linked to human 
evolution and bipedalism. 


Results 
A deep-learning approach for quality control and 
quantification of biobank-scale imaging data 


To study the genetic basis of human SPs, we 
jointly analyzed DXA and genetic data from 
42,284 individuals in the UK Biobank (UKB). 
Individuals from this dataset are between 40 
and 80 years old and reflect adult skeletal 
morphology. We report baseline information 
about our analyzed cohort in (30) and in table 
S1. We acquired 328,854 DXA scan images 
across eight imaging modalities comprising 
full-body transparent images, full-body opaque 
images, anteroposterior (AP) views of the left 
and right knees, AP views of the hips, and AP 
and lateral views of the spine. For quality con- 
trol (QC), we first developed a deep learning- 
based multiclass predictor to select full-body 
transparent images from the pool of eight total 
imaging modalities. We developed a second 
deep-learning classifier to remove cropping 
artifacts. Finally, we excluded images with atyp- 
ical aspect ratios and padded them to uniform 
sizes (30) (Fig. 1A). After our QC process, we 
were left with 39,469 images for analysis. 

After image QC, we manually labeled 14 
landmarks at pixel-level resolution on 297 
images for use as training data. These labels 
were independently validated by an orthope- 
dic team. The 14 landmarks include the major 
joints—the wrist, elbow, shoulder, hip, knee, 
and ankle—and the position of each eye. The 
segments connecting these landmarks reflect 
natural measurements for long-bone lengths 
or body-width measures. We assessed the rep- 
licability of manual annotation by inserting 20 
duplicated images from the 297 training images 
without the knowledge of the annotator and 
found that repeat measurements resulted in a 
difference of less than 2 pixels at any landmark 
(30) (Fig. 1B). 

We adapted and applied a new computer vi- 
sion architecture, High-Resolution Net (HRNet), 
for landmark estimation, or the prediction of the 
location of human joints (31). There are four 
main reasons why we chose HRNet. First, 
HRNet maintains high-resolution represen- 
tations throughout the model (30), and we 
wanted to use the high-resolution medical 
images produced by the DXA scanner to ob- 
tain precise measurement information of bone 
lengths. Second, the architecture had already 
been trained on two large imaging datasets, 
first on ImageNet (32), a general natural image 
dataset, and then subsequently on Common 
Objects in Context (COCO) (33), a dataset of 
more than 200,000 images of humans in natu- 
ral settings with joint landmarks classified. 
These two previous layers of training enabled 
us to perform transfer learning to fine-tune 
the architecture on our training data and reduce 
the total amount of manual annotation to just 
297 images. Third, HRNet has among the best 
performance for a similar task of labeling 
human joints on two large-scale benchmarking 
datasets of human subjects (33, 34). Finally, we 
directly compared the performance of the 
HRNet architecture with a more traditional 
architecture on our dataset (ResNet-34) (35) 
and obtained significantly better results across 
different training parameter choices (30) (table 
S2). Upon training, the model achieved greater 
than 95% average precision on hold-out vali- 
dation data across all body parts (table S2). 


Validation of human skeletal length estimates 
After training and validating the deep-learning 
model on the 297 manually annotated images, 
we applied this model to predict the 14 land- 
marks on the rest of the 39,172 full-body DXA 
images. We then calculated pixel distances 
between pairs of landmarks that corresponded 
to seven bone and body-length segments (30) 
(Fig. 1B and table S3). We also computed an 
angle measure between the tibia and the femur 
(tibiofemoral angle, or TFA) (Fig. 1B). To stan- 
dardize images with different aspect ratios, we 
rescaled pixels into centimeters for each image 
resolution by regressing the height in pixels 
against standing height in centimeters as mea- 
sured by the UKB assessment (30). We then 
removed individuals with any skeletal measure- 
ments that were more than four standard de- 
viations from the mean. 
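
As a rough illustration of the measurement steps described above, the following Python sketch computes a segment length from two landmarks, converts pixels to centimeters, and flags outliers. It is not the authors' pipeline: the function names are hypothetical, and fitting the pixel-to-centimeter scale as a least-squares slope through the origin is a simplifying assumption (the text states only that height in pixels was regressed against standing height for each image resolution).

```python
import math
import statistics


def segment_length(p, q):
    """Euclidean distance in pixels between two (x, y) landmarks."""
    return math.dist(p, q)


def fit_cm_per_pixel(height_px, height_cm):
    """Least-squares slope through the origin, so that cm ~ slope * pixels.

    Assumption: one scale factor per image resolution, no intercept.
    """
    num = sum(px * cm for px, cm in zip(height_px, height_cm))
    den = sum(px * px for px in height_px)
    return num / den


def within_sd(values, n_sd=4.0):
    """Mask that is True where a value lies within n_sd SDs of the mean."""
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [abs(v - mu) <= n_sd * sd for v in values]


# Hypothetical example: a femur spanning hip (100, 300) to knee (100, 500).
femur_px = segment_length((100, 300), (100, 500))
scale = fit_cm_per_pixel([700, 800], [161.0, 184.0])
femur_cm = femur_px * scale
```

In practice the scale would be fitted separately for each image resolution, and the four-standard-deviation filter would be applied to each skeletal measure in turn.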

After outlier removal, we validated the accu- 
racy of our measurements on the remaining 
samples in four ways. First, the error rate for 
segment length from the model compared with 
manual annotation was, at maximum, 3 pixels 
or 0.7 cm, which is similar to the variation from 
manual annotation of the 20 duplicate images. 
Reliability (100% minus variance in measurement di- 
vided by variance of a segment length) was 
greater than 95% across all length measures 
(30) (Fig. 1C and tables S4 to S6). Second, the 
correlation between long-bone lengths and 
height as measured in the UKB was 
~0.88, which falls within the expectation ob- 
served in the literature (17) (Fig. 1D). Third, the 
correlation between left and right limb lengths 
was greater than 0.99 (Fig. 1E). Fourth, a sub- 
set of 667 individuals had undergone repeat 
imaging an average of 2 years apart, with dif- 
ferent image aspect ratios, DXA machines, soft- 
ware models, and technicians carrying out the 
imaging (Fig. 1F). The correlation in these tech- 
nical replicates across skeletal elements was also 
greater than 0.99. Taken together, these results 
suggest that the IDPs from our deep-learning 
model are highly accurate and highly replicable. 
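
Two of these checks can be sketched in a few lines of Python. The sketch assumes that reliability is defined as one minus the ratio of measurement-error variance to segment-length variance, which is our reading of the definition given above rather than a confirmed formula, and it uses manual annotations as the reference lengths. Function names and the sample numbers are hypothetical.

```python
import math
import statistics


def reliability(model_lengths, manual_lengths):
    """1 - var(model - manual) / var(manual); an assumed definition."""
    errors = [m - h for m, h in zip(model_lengths, manual_lengths)]
    error_var = statistics.pvariance(errors)
    length_var = statistics.pvariance(manual_lengths)
    return 1.0 - error_var / length_var


def pearson(x, y):
    """Pearson correlation, e.g. between left and right limb lengths."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical femur lengths in cm: model predictions vs. manual labels.
model = [40.1, 42.0, 44.9, 47.0]
manual = [40.0, 42.0, 45.0, 47.0]
rel = reliability(model, manual)
```

A reliability near 1 means that measurement error is small relative to the between-individual variation in the segment.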


Characteristics and correlations of human SPs 
with sex, age, and height 


We examined the seven bone and body-length 
segments as proportions instead of lengths 
(to control for variation in overall 
height, which is highly correlated with each of 
these lengths) by taking simple ratios of each 
IDP with overall height (30) (Fig. 1B). We also 
carried out this normalization analysis in 
alternate ways, including using height as a co- 
variate in association tests as well as regressing 
each IDP with height and obtaining residuals. 
All three approaches were highly correlated, 
and we used the simple approach of taking 
proportions for most analyses (30). As expected, 
this greatly reduced the overall correlation of 
our traits with height (table S7). In addition to 
obtaining ratios of each segment length with 


overall height, we also computed ratios of seg- 
ments with each other and obtained a total of 
21 different ratio IDPs along with the angle 
measure (TFA) (table S3). These ratios are 
referred to in the text as Segment:Segment (Hip 
Width:Height, Torso Length:Legs, and so on). 
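
Two of the height-normalization strategies mentioned above, simple ratios and regression residuals, can be illustrated with a short Python sketch. This is a minimal example with hypothetical function names and made-up numbers, not the code used in the study; the third strategy (height as a covariate in the association test) is not shown.

```python
import statistics


def ratio_to_height(lengths, heights):
    """Segment:Height proportion for each individual."""
    return [seg / h for seg, h in zip(lengths, heights)]


def residualize(lengths, heights):
    """Residuals of an ordinary least-squares fit of length on height."""
    mh = statistics.fmean(heights)
    ml = statistics.fmean(lengths)
    sxy = sum((h - mh) * (seg - ml) for h, seg in zip(heights, lengths))
    sxx = sum((h - mh) ** 2 for h in heights)
    slope = sxy / sxx
    intercept = ml - slope * mh
    return [seg - (intercept + slope * h) for h, seg in zip(heights, lengths)]


# Hypothetical heights and femur lengths, both in cm.
heights = [150.0, 160.0, 170.0, 180.0]
femurs = [39.5, 42.0, 44.5, 47.0]
proportions = ratio_to_height(femurs, heights)
residuals = residualize(femurs, heights)
```

Ratios are easy to interpret as body proportions, whereas residuals are uncorrelated with height by construction.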

We then examined differences in SPs across 
sex and age. In line with well-known observations, 
Hip Width:Height (Student’s t test p < 10°”) 
and Torso Length:Height (Student’s t test p < 
10 *°) were significantly larger in women than in 
men (36), but we also observed that Humerus: 
Height was significantly larger in women 
than in men (Student’s t test p = 1.45 x 10~) (30) 
(table S8). In addition, we found that all body 
proportions vary slightly but significantly as a 
function of age (30) (table S9). We also exam- 
ined how body proportions vary as a function of 
overall height and found that Torso Length:Legs 
decreases with height [Pearson correlation (r) = 
-0.21], suggesting that increases in height are 
driven more by increasing leg length than by 
torso length (Fig. 2A). Arms:Legs also decreases 
with height (r = -0.02), meaning that leg length 
also outpaces arm length as height increases. 
Within each limb, for both arms and legs, lower 
to upper limb ratios (Tibia:Femur, Forearm: 
Humerus) increase with overall limb length. 
These increases also correspond with correla- 
tions with height, with Tibia:Femur increasing 
when height increases (r = 0.12). 


GWASs of human SPs 


We performed GWASs using imputed geno- 
type data in the UKB to identify variants asso- 
ciated with each skeletal measure. We applied 
standard variant and sample QC and focused 
our analyses on 31,221 individuals of white 
British ancestry, as determined by the UKB 
genetic assessment, and 7.4 million common 
biallelic single-nucleotide polymorphisms (SNPs) 
with minor allele frequency >1% (30, 37) 
(tables S1 and S10). We used BOLT-LMM (38) 
to regress variants on each skeletal measure 
using a linear mixed-model association frame- 
work. After generating summary statistics for 
each skeletal measure, we estimated SNP 
heritability using LD Score regression (LDSC) 
(39) and GCTA-REML (40). All traits were highly 
heritable, with SNP heritability between 23 
and 53% for LDSC and between 17 and 50% 
for GCTA-REML (tables S11 and S12). We de- 
tected inflation in test statistics in our quantile- 
quantile (QQ) plots (mean inflation, λ = 1.20); 
however, minimal deviation of univariate LDSC 
intercepts from 1.0 suggested that this inflation 
was consistent with polygenicity rather than 
confounding (30) (Fig. 3B). 
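
For readers unfamiliar with the inflation statistic, the sketch below shows one common way to compute a genomic inflation factor λ from association z-scores: the median observed chi-square statistic divided by the median of a chi-square distribution with one degree of freedom. This is the conventional definition and is assumed here; the text does not specify how λ was computed, and the LDSC intercept that separates polygenicity from confounding is not reproduced.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364231195724


def genomic_inflation(z_scores):
    """Median observed chi-square (z squared) over its null expectation."""
    chi2 = [z * z for z in z_scores]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


# Under the null, the median |z| is about 0.6745, which gives lambda near 1.
null_lambda = genomic_inflation([0.6744897501960817] * 5)
```

A value such as λ = 1.20 indicates that test statistics are larger than expected under the null, which can reflect either confounding or a highly polygenic trait.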

In the seven SPs as a ratio of height (Forearm: 
Height, Humerus:Height, Tibia:Height, Femur: 
Height, Hip Width:Height, Shoulder Width:Height, 
Torso Length:Height) and TFA, we identified 
223 loci at p < 5 × 10⁻⁸ and 150 loci at p < 6.25 × 10⁻⁹ 
(Bonferroni correction for eight traits). Of these 
Fig. 1. ... DXA images of different body parts, as well as to remove images with artifacts, resolution, or cropping issues. Full-body images were then padded to standardize ... 5 pixels on each side). (B) Image quantification. Deep learning-based image landmark estimation using the HRNet architecture is shown. During this process, 297 training images annotated with specific landmarks were used to train the ... measurements were calculated. (C) Average HRNet measurement error when compared with human-derived measurements of the tibia across 100 validation ... between left- and right-side measurements of the femur, humerus, forearm, and tibia. (F) Correlation of lengths measured from the first and second imaging visits for the same individual. 
Fig. 2. Genetic architecture. (A) Correlation of SPs and overall height. Bars show ±2 SE. (B) Genotype (lower-left triangle) and phenotype (upper-right triangle) 
correlation of SPs. Overall correlation is shown in color, and the p value of the correlation is visualized by size. A Bonferroni-corrected threshold is also shown. 
(C) Solution for a genomic SEM model for the genetic covariance structure shown in (B) shows one common factor loading for arms, an additional factor for legs, 
and independent factors for each of the torso-related traits (hip width, shoulder width, and torso length). (D) Sex-specific analysis showing the ratio of the 
standardized effect size of the polygenic score on each trait (±2 SE) in males to the effect in females in a hold-out dataset. 


loci, 145 are independently significant [linkage 
disequilibrium (r²) < 0.1] across all eight pheno- 
types (92 after Bonferroni correction for eight 
traits). Of the 145 independent loci, 37 loci are 
only significant in SPs after conditioning on 
all SNPs discovered in a saturated GWAS for 
height (16, 30) (table S13). As a sensitivity anal- 
ysis, we also examined the genetic effect of 
skeletal lengths before and after height adjust- 
ment and found that 95% of genome-wide 
significant loci had the same direction of effect 
when carrying out GWASs in these alternate 
ways (30). 


Genetic correlations and factor analysis of SPs 


We calculated the genetic correlation between 
each pair of traits to investigate the degree of 
genetic sharing between each skeletal measure. 


Estimates from LDSC and GCTA-REML were 
virtually identical (fig. S10); in this work, we 
report estimates from GCTA-REML. Limb pro- 
portions had positive genetic correlations with 
each other (rg = 0.34 to 0.55). Upper arms and 
legs (Humerus:Height-Femur:Height rg = 0.55, 
p = 159 x 10) and lower arms and legs 
(Forearm:Height-Tibia:Height rg = 0.51, p = 
601 x 10”) were significantly more correlated 
Fig. 3. Genome-wide association results. (A) Manhattan plot of a GWAS performed across seven SPs and TFA; the lowest p value for any trait at each SNP is annotated. 
Loci over the genome-wide significance threshold that are close to only a single gene are annotated. (B) Shown are mean values of proportion and angle traits across 
individuals, the total number of genome-wide significant loci per trait, heritability (GCTA-REML), λ (from LDSC), and associated genes of loci that are specific to each skeletal 
trait (again annotating only loci that map to a region with a protein-coding gene within 20 kb of each clumped region). Illustration was created with BioRender.com. 


than upper arms and lower legs (Humerus: 
Height-Tibia:Height rg = 0.38, p = 5.18 x 10°”) 
or lower arms and upper legs (Forearm:Height- 
Femur:Height rg = 0.34, p = 1.49 x 107"). Body- 
width proportions, Hip Width:Height and Shoul- 
der Width:Height, were largely uncorrelated 
with limb-length proportions (30). No corre- 
lations involving any pairwise combination of 
arm and width traits were significant (the 
minimum p value across all such correlations 
was 0.0022, which was above our Bonferroni 
threshold). Correlations between leg and width 
traits were marginally significant in three out of 
four comparisons, with the maximal correla- 
tion (Hip Width:Height-Tibia:Height) being 
0.23 (Fig. 2B and table S13). In addition, we 
computed phenotypic correlations between 
our traits, which were highly concordant with 
genetic correlations (r = 0.98). 

We used genomic structural equation model- 
ing (genomic SEM) to produce an empirically 
derived low-dimensional representation of the 
genetic covariance structure of the individual 
SPs (41). We performed exploratory factor analy- 
sis to identify the likely number of factors and 
built confirmatory models using odd-numbered 
chromosomes for model building and even- 
numbered chromosomes for validation, which 
we compared using a range of model fit indices 
(30). Our preferred model of the genetic co- 
variance structure revealed five main factors 
that governed SPs. First, we identified a single 
broad factor (Skeletal factor) that represents 
dimensions of genetic variation that are sta- 
tistically pleiotropic; that is, genetic variation 
represented in each factor contributes to varia- 
tion in not just one phenotype but to variation 
in multiple phenotypes (30). All limb traits [both 
arms (Humerus:Height and Forearm:Height) 
and legs (Femur:Height and Tibia:Height)] 
load positively on this general Skeletal factor 
(on which Torso Length:Height loads nega- 
tively), but the arm traits additionally load on a 
second factor. Torso length and body-width 
traits (Hip Width:Height and Shoulder Width: 
Height) only load appreciably on trait-specific 
factors (Fig. 2C). That torso and body-width 
proportions do not load appreciably on either 
the general Skeletal factor or the Arm factor 
reinforces our observations from the pairwise 
bivariate genetic correlation analysis in which 
arm and leg proportions were largely inde- 
pendent of torso and body-width proportions. 
Moreover, the genomic SEM results produce 
insights that inspection of the genetic correla- 
tion matrix by itself does not (30). We find that 
genetic sharing between the two components 
of leg length (femur and tibia) does not repre- 
sent genetic variation specific to leg growth per 
se but rather represents a more general dimen- 
sion of genetic variation shared with the upper 
limbs (forearm and humerus). By contrast, the 
upper limbs specifically share genetic variation 
with one another (as indexed by the Arm fac- 
tor) above and beyond a more general dimen- 
sion of skeletal limb proportions (30). 


Sex-specific heritabilities and genetic 
effects of SPs 


Anthropometric and skeletal traits, such as hip 
width, are common examples of sexual dimor- 
phism. We found that for most traits, the 
genetic correlation of SPs between males and 
females was not statistically different from a 
value of one except for TFA (rg = 0.89) (30) (fig. 
S16). For five out of the seven SPs, both of the 
sex-specific SNP heritabilities were greater 
than the heritability estimated jointly with 


both sexes (fig. S17). 
To test for pervasive differences in the mag- 
nitude of genetic effects, we performed sex- 
specific GWASs of all the skeletal traits and 
evaluated these polygenic scores in both sexes 
in a hold-out dataset (30). This method had 
recently been applied to examine sex-specific 
effects in biobank traits (42). Across all SPs 
that we tested, polygenic scores had a signi- 
ficantly larger standardized effect size (standard- 
ized in males and females separately) in males 
compared with females (Student’s t test p < 1 x 
10 ° for all comparisons) (Fig. 2D). These results 
are in line with previous work suggesting that 
SPs, like other anthropometric traits, have clear 
differences in the magnitude of sex-specific ef- 
fects when compared with other quantitative 
traits in the UKB (42). 


Biological insights from skeletal associations 


We performed gene set enrichment analyses 
in 10,678 gene sets using functional mapping 
and annotation (FUMA) of GWASs to identify 
biological processes and pathways enriched in 
each skeletal trait (30, 43). After false discovery 
rate (FDR) correction (FDR < 0.05), we found 
195 gene sets to be significantly enriched 
across our seven skeletal traits. Several gene 
sets related to development were common across 
most traits, such as skeletal system develop- 
ment, connective tissue development, chondro- 
cyte differentiation, and cartilage development 
(table S15). 
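
The FDR step can be illustrated with the Benjamini-Hochberg step-up procedure, sketched below in Python. The text reports only that an FDR threshold of 0.05 was applied, so the choice of Benjamini-Hochberg is an assumption made for illustration, and the function name and p values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k such that the k-th smallest
    p value is <= (k / m) * alpha, then reject the k smallest p values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical gene-set p values for one trait.
significant = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Because the rule is step-up, a gene set whose p value misses its own rank threshold can still be declared significant if a higher-ranked p value passes.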

Furthermore, common alleles associated with 
SPs were significantly enriched in 701 auto- 
somal genes linked to “skeletal growth abnor- 
mality” in the Online Mendelian Inheritance in 
Man (OMIM) (44) database (p < 5.0 x 10°”) ex- 
cept genes associated with torso length (p = 
0.22) (tables S16 and S17). Combined, these 
results indicate that common variants asso- 
ciated with SPs pinpointed genes in which rare 
coding variants contribute to Mendelian mus- 
culoskeletal disorders. To determine if loci 
discovered in our GWASs had been impli- 
cated in previous genetic studies, we queried 
the GWAS catalog (45) for each of the 145 in- 
dependent SNPs in our study. As expected, the 
largest overlaps were seen with anthropomet- 
ric traits (table S18). 

Out of the total loci identified across GWASs 
(table S18), 45 loci overlapped a single protein- 
coding gene within 20 kb of each clumped 
region. Notably, of these 45 genes, 32 (or 71%) 
resulted in abnormal skeletal phenotypes when 
disrupted in mice using the Human-Mouse 
Disease Connection database (46). Four of 
these genes (COL11A1, SOX9, FINI, and ADGRG6) 
were associated with rare skeletal diseases in 
humans, as annotated in OMIM (table S20). In 
some cases, a gene linked with a specific SP in 
our GWASs resulted in a defect in the same 
skeletal trait in mouse models. We found that 
a common variant (rs6546231) near MEIS1, a 
homeodomain transcription factor, is associated 


with increased Forearm:Height. Mouse models 
of MEIS1 mice are specifically associated with 
abnormal forelimb development (47). Similarly, 
a common variant (rs1891308) near ADGRG6, 
which encodes a G protein-coupled receptor, is 
associated with increased torso length. Mice 
with conditional knockouts in ADGRG6 have 
spine abnormalities that reduce torso length 
(48). Thus, our GWAS of SPs identifies genes 
that were previously associated with skeletal 
developmental biology and Mendelian skeletal 
phenotypes, demonstrating the potential for 
future functional and knockout studies. 

Next, we conducted a transcriptome-wide 
association study (TWAS) that linked predicted 
gene expression in skeletal muscle [based on 
the Genotype-Tissue Expression project (GTEx v.7) 
(49)] with our SP GWAS. In total, we identified 
30 genes that were significantly associated with 
any one of our skeletal traits at a Bonferroni- 
corrected significance threshold across the total 
number of gene and trait combinations (30) 
(table S21). Among the strongest TWAS asso- 
ciations were PAX1 (TWAS z-score = 12.6, p = 
1.31 x 10~*), a transcription factor that is criti- 
cal in fetal development and is associated with 
development of the vertebral column, and FGFR3 
(TWAS z-score = 6.5, p = 8.52 x 10), a fibroblast 
growth factor receptor that plays a role in 
bone development and maintenance. 


Genetic and phenotypic association of skeletal 
phenotypes with musculoskeletal disease 


To investigate the clinical relevance of human 
SPs, we examined their genetic and pheno- 
typic associations with musculoskeletal disease 
and with joint and back pain. We used logistic 
regression to examine phenotypic associations 
between skeletal morphology and these muscu- 
loskeletal disorders (Fig. 4A) while controlling 
for age, sex, bone-mineral density, body mass 
index (BMI), and other major risk factors for 
OA (50). We found that one standard devia- 
tion in Hip Width:Height was associated with 
increased odds of hip OA [p = 3.16 x 10~°, odds 
ratio (OR) = 1.34]. Similarly, Femur:Height, 
Tibia:Height, and the TFA, which are all skel-
etal measures of the knee joint, were associated
with increased risk of knee OA (p = 2.24 x 10°”,
OR = 1.34; p = 6.09 x 10°, OR = 1.16; p = 1.64 x
10”, OR = 1.49). Femur:Height and the TFA
were also significantly associated with internal 
derangement of the knee (p = 4.03 x 10%, OR = 
1.19; p = 1.43 x 10”, OR = 1.34). Pain phenotypes 
for hip and knee joints were also associated with 
the specific SPs that make up each joint (hip 
pain with Hip Width:Height: p = 8.53 x 10°, 
OR = 1.12; knee pain with Femur:Height, Tibia: 
Height, and TFA: p = 8.13 x 10°, OR = 1.09; p = 
2.89 x 107°, OR = 1.09; p = 1.66 x 10°*°, OR = 
1.31) (30) (Fig. 4A) (table S22).
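
To make the "odds ratio per standard deviation" reading concrete, the following is a minimal sketch of a logistic regression of a binary outcome on one standardized trait. It is not the authors' pipeline: the covariates listed above are omitted, the data are simulated rather than drawn from the UKB, and the names `standardize` and `fit_logistic` are hypothetical helpers.

```python
import math
import random


def standardize(values):
    """Rescale to mean 0 and sample SD 1 so effects are per standard deviation."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]


def fit_logistic(z, y, n_iter=25):
    """Fit logit P(y = 1) = b0 + b1 * z by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, yi in zip(z, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * zi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient, intercept
            g1 += (yi - p) * zi     # gradient, slope
            h00 += w                # information matrix entries
            h01 += w * zi
            h11 += w * zi * zi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Simulated cohort (assumption): a ratio trait and an outcome whose true
# odds ratio per SD is 1.34.
rng = random.Random(0)
ratio = [rng.gauss(0.25, 0.02) for _ in range(20000)]
z = standardize(ratio)
true_b1 = math.log(1.34)
disease = [
    1 if rng.random() < 1.0 / (1.0 + math.exp(-(-2.0 + true_b1 * zi))) else 0
    for zi in z
]

b0, b1 = fit_logistic(z, disease)
odds_ratio_per_sd = math.exp(b1)  # exponentiated slope = OR per 1 SD of the trait
```

Because the predictor is standardized before fitting, exponentiating the slope gives the change in odds for a one-standard-deviation increase in the trait, which is how the ORs in this section are expressed.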

Fig. 4. Association between skeletal traits and musculoskeletal disease. (A) Phenotypic associations from logistic regression analysis of musculoskeletal disease
traits on skeletal phenotypes. (B) Polygenic risk score associations between musculoskeletal disease traits and skeletal phenotypes. For both (A) and (B), associations that
are significant after Bonferroni correction are annotated with an asterisk. ORs for the phenotypic associations and polygenic risk scores are shown in different colors,
and the p values are represented by size. The number notations in parentheses are the ICD-10 codes associated with each disease: M54, Dorsalgia; M16, Coxarthrosis
(arthrosis of hip); M17, Gonarthrosis (arthrosis of knee); and M23, Internal derangement of knee.

Next, we analyzed 361,140 UKB participants
who had not undergone DXA imaging and
were of white British ancestry for predictive
risk based on polygenic scores derived from
our GWAS on SPs on the imaged set of indi- 
viduals (Fig. 4B). We generated polygenic scores 
with Bayesian regression and continuous shrink-
age priors (51) using the significantly associated
SNPs and ran a phenome-wide association study 
of the generated risk scores and traits, adjusting 
for the first 20 principal components of ancestry 
and imputed sex (30). Polygenic scores of Hip 
Width:Height and TFA were associated with 
an increased incidence of hip and knee OA, 
respectively (p = 7.92 x 10°, OR = 1.04; p = 1.73 x
10~*, OR = 1.04), in line with the phenotypic 
associations. In addition, we also saw signif- 
icant association between back pain [both 
recorded on the ICD-10 (International Classi- 
fication of Diseases, Tenth Revision) code and 
self-reported] and Torso Length:Height (p = 
5.59 x 10°°, OR = 1.05; p = 5.71 x 10°, OR = 
1.02) (table S23). Neither the OA nor the mus-
culoskeletal pain phenotypes that we tested
were significantly associated with overall height 
in this analysis [phenotypic associations: 1.10 x 
10° < p < 8.51 x 10; polygenic risk score
associations: 2.17 x 10° < p < 3.88 x 10°""] ex- 
cept for polygenic risk scores of height and 
back pain (p = 5.76 x 107'°) (tables S22 and 
S23). In genomic SEM analyses, we observed
similar patterns of genetic associations with 
musculoskeletal diseases at the level of gen- 
eral genetic factors (30) (fig. S13 and table S24). 
Taken together, these analyses suggest that in- 
creases in the length of skeletal elements that 
are associated with the hip, knee, and back as a
ratio of overall height are exclusively associated
with an increased risk of arthritis and pain
phenotypes in those specific areas. 
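
A polygenic score of the kind used above is, at its core, a weighted sum of allele dosages. The sketch below assumes the per-SNP weights are already available (for example, posterior effect sizes from the Bayesian shrinkage step, which is not reproduced here); the SNP identifiers and weights are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Sum of per-SNP weights times allele dosages (0 to 2) for one individual."""
    missing = [snp for snp in weights if snp not in dosages]
    if missing:
        raise KeyError(f"missing dosages for: {missing}")
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Hypothetical effect weights for three SNPs and one individual's dosages.
weights = {"snp_a": 0.12, "snp_b": -0.05, "snp_c": 0.30}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_score(person, weights)
```

In an analysis like the one described, scores computed this way for each participant would then enter a logistic or linear regression against each trait, with ancestry principal components and sex as covariates.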


Evolutionary analysis 


As human SPs are an important part of our 
transformation to bipedalism, we next inves- 
tigated whether variants associated with SPs 
have undergone accelerated evolution in hu- 
mans in two ways. First, following a proce-
dure by Richard et al. (52) and Xu et al. (53),
we examined whether genes associated with 
SPs overlapped human accelerated regions 
(HARs) more than expectation. HARs are seg- 
ments of the genome that are conserved through- 
out vertebrate and great ape evolution but are 
notably different in humans (54). We gener- 
ated a null distribution by randomly sampling 
regions matched for overall gene length (30) 
(Fig. 5A). For comparison, we also performed 
the same analysis on summary statistics from 
the ENIGMA Consortium (55) and several com- 
mon quantitative and disease traits from the 
UKB (table S25). Genetic signals from several 
of the SP traits, in particular arm or leg length, 
were significantly enriched in HARs (Arms: 
Legs, Humerus:Height, Arms:Height, Hips: 
Legs, Tibia:Femur, and Hip Width:Height had 
FDR-adjusted p < 0.05). We also observed nom- 
inal enrichment for traits related to hair pig- 
mentation (FDR-adjusted p = 0.013), which has 
also changed substantially in humans com- 
pared with the great apes, and for schizophre- 
nia (FDR-adjusted p = 1.61 x 10 **). However, 
no enrichment (FDR-adjusted p > 0.05) was 
observed for HARs in autoimmune disorders,
cardiovascular disease, cancer, and overall
height (Fig. 5A). 
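
The null distribution built from length-matched random regions can be sketched as a permutation test. This is an illustrative version only: matching genes by length rank bins, the bin count, and the function name `har_enrichment_p` are assumptions, and the authors' exact matching procedure is described in (30).

```python
import random
from collections import defaultdict


def har_enrichment_p(trait_genes, all_genes, n_perm=2000, n_bins=5, seed=0):
    """Empirical p-value for HAR overlap among trait genes.

    all_genes maps gene name -> (length_bp, overlaps_har).
    Null sets are drawn from the same length bins as the trait genes.
    """
    rng = random.Random(seed)
    # Bin genes by length rank so that each bin holds a similar number of genes.
    ranked = sorted(all_genes, key=lambda g: all_genes[g][0])
    bin_of = {g: i * n_bins // len(ranked) for i, g in enumerate(ranked)}
    pools = defaultdict(list)
    for gene, b in bin_of.items():
        pools[b].append(gene)

    trait_genes = list(trait_genes)
    observed = sum(all_genes[g][1] for g in trait_genes)
    need = defaultdict(int)  # how many genes to draw from each length bin
    for g in trait_genes:
        need[bin_of[g]] += 1

    at_least = 0
    for _ in range(n_perm):
        null = sum(
            all_genes[g][1]
            for b, k in need.items()
            for g in rng.sample(pools[b], k)
        )
        at_least += null >= observed
    # Add-one correction keeps the empirical p-value above zero.
    return observed, (1 + at_least) / (1 + n_perm)


# Toy genome (assumption): 1000 genes, every 20th gene overlaps a HAR.
genes = {f"g{i}": (1000 + 100 * i, i % 20 == 0) for i in range(1000)}
enriched_set = [f"g{i}" for i in range(0, 400, 20)] + [f"g{i}" for i in range(1, 11)]
depleted_set = [f"g{i}" for i in range(1, 11)]

obs_enriched, p_enriched = har_enrichment_p(enriched_set, genes)
obs_depleted, p_depleted = har_enrichment_p(depleted_set, genes)
```

Matching on gene length matters because longer genes are more likely to overlap any set of genomic segments by chance, so an unmatched null would overstate enrichment for traits whose associated genes tend to be long.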

Second, we examined heritability enrich- 
ment using LDSC on genomic annotations that 
reflect divergence at different time points in 
human evolution (Fig. 5B) following an approach 
outlined in Sohail (56) and Hujoel et al. (57). 
These annotations include regions that differ 
in gene regulation between humans and pri- 
mates through stages of early development 
(58), regions that differ in expression between 
adult humans and macaques (59), and regions 
that are enriched and depleted of ancestry from 
archaic humans (60, 61). We then computed
heritability enrichment, h²(C), which measures
the proportion of heritability in an annotation 
set divided by the proportion of SNPs in the 
annotation. In our analysis, we also simulta- 
neously incorporated other regulatory elements, 
measures of selective constraint, and linkage 
statistics (baseline LDv2.2 with 97 annotations) 
(57, 62-64) to estimate h²(C) while minimizing
bias due to model misspecification (30). 
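
The enrichment statistic defined above is a ratio of two proportions, which the following minimal sketch computes directly. The heritability and SNP counts are made-up inputs; in practice the per-annotation heritability would come from a partitioned LDSC fit, which is not reproduced here.

```python
def heritability_enrichment(h2_annotation, h2_total, snps_annotation, snps_total):
    """Proportion of heritability in an annotation divided by its proportion of SNPs."""
    if h2_total <= 0 or snps_total <= 0 or snps_annotation <= 0:
        raise ValueError("totals and annotation SNP count must be positive")
    prop_h2 = h2_annotation / h2_total
    prop_snps = snps_annotation / snps_total
    return prop_h2 / prop_snps


# Hypothetical annotation: 1% of SNPs explaining 8% of trait heritability.
enrichment = heritability_enrichment(
    h2_annotation=0.04, h2_total=0.50,
    snps_annotation=10_000, snps_total=1_000_000,
)

# An annotation explaining heritability in proportion to its size is not enriched.
no_enrichment = heritability_enrichment(0.05, 0.50, 100_000, 1_000_000)
```

A value of 1 means the annotation carries exactly its proportional share of heritability; values above 1 indicate enrichment and values below 1 indicate depletion.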

Meta-analyzing across all our SP traits, we 
found enrichment in fetal human-gained en- 
hancers and promoters at early time points 
[7, 8.5, and 12 postconception weeks (pcw):
h²(C) = 8.08, p = 5.91 x 10 **; h²(C) = 3.60, p =
2.55 x 10-*; h²(C) = 3.65, p = 3.55 x 10-*; table
S26] but not in adults, suggesting that genes
associated with SPs are differentially expressed 
in early development between apes and hu- 
mans. Although we acknowledge that the anno- 
tations of differentially regulated elements are 
from developing brain and not skeletal tissues, 
fetal human-gained brain regulatory elements
and adult human skeletal regulatory elements
are correlated at 58% (56, 65). Moreover, we only
observed enrichment in developing, but not
adult, tissues, suggesting that the enrichment
is not driven by confounders of tissue type but
by differences in development between the two
species. As a second line of analysis, we also
examined enrichment of individual traits across
the different annotations, controlling for
multiple hypothesis correction at the level of
FDR < 0.05. Out of 21 of our SP traits (Hip Width:
Height, Hip Width:Shoulder Width, Arms:Legs,
Shoulder Width:Torso Length, Hip Width:Arms,
Shoulder Width:Height, Hip Width:Legs,
Shoulder Width:Legs, Shoulder Width:Arms), 9 were
significantly enriched at 7 pcw at FDR < 0.05
(Fig. 5C and table S27). In addition, we saw
depletion in regions of the genome that were
depleted for Neanderthal and Denisovan
ancestry, particularly for overall leg length [h²(C) =
0.44, p = 5.89 x 10°] (table S27). These results
were consistent with another analysis that
showed a depletion of Neanderthal informative
markers in contrast with modern human
mutations, particularly for anthropometric traits
(66), and are suggestive of purifying selection.
The proportion traits that were significantly
enriched across both types of evolutionary
analysis were associated with Arms:Legs and Hip
Width ratios (Fig. 5D). These results suggest
that specific SPs, but not overall height or
several other quantitative and disease traits
examined by us or Sohail (56), underwent human
lineage-specific evolution since the separation
of humans from the great apes.

Fig. 5. [...] overlap of HARs with genes associated with SP, autoimmune, dermatological,
neurological, endocrine, gastrointestinal, metabolic, psychiatric, and cancer-
related traits compared with randomly sampled genes of comparable length.
Traits below the FDR-corrected threshold (0.05) are shown in orange, and
nonsignificant traits are shown in blue. (B) Meta-analysis of LDSC heritability
enrichment across 21 SP traits for different evolutionary annotations that
represent different divergence points in human evolution. Annotations represented
in colors refer to fetal human-gained enhancers and promoters (blue),
adult human-gained enhancers and promoters (orange), ancient selective
sweeps (purple), putatively introgressed variants from Neanderthals (teal), and
genomic regions depleted in Neanderthal and Denisovan ancestry (teal).
[...] color intervals mark genetic annotations. Asterisks show significance at
FDR < 0.05. A dashed line is drawn at y = 1 (no heritability enrichment). This
analysis was jointly performed with all genomic annotations in the baseline
LDv2.2 model. KYA, thousand years ago; MYA, million years ago. (C) Heritability
enrichment analysis in human-gained enhancers and promoters at 7 pcw for each
trait analyzed. Asterisks show significance at FDR < 0.05 across all genomic
annotations and traits analyzed in this study. A dashed line is drawn at x = 1 (no
heritability enrichment). Error bars show 1 SE around each estimate. (D) Arm:Leg
ratio and Hip Width:Height are the only two skeletal traits that show significant
enrichment in both types (HARs and heritability across differentially regulated regions
at 7 pcw) of evolutionary analysis. Illustration was created with BioRender.com.
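
The FDR-adjusted p values reported in these evolutionary analyses can be illustrated with a short sketch. It assumes the Benjamini-Hochberg step-up procedure, which is the most common FDR adjustment; the text does not state which procedure was used, and the input p-values below are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical nominal p-values for four traits.
nominal = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(nominal)
enriched = [i for i, q in enumerate(fdr) if q < 0.05]
```

Each adjusted value is the smallest FDR level at which that test would be declared significant, so comparing it with 0.05 reproduces the "FDR < 0.05" criterion used in the text.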


Discussion 


In this study, we used deep learning to under- 
stand the genetic basis of skeletal elements 
that make up the human skeletal form using 
DXA imaging data in a large population-based 
biobank. We carried out genetic correlation 
and factor analysis to characterize the joint 
genetic architecture of these skeletal traits. 
We identified 145 independent genetic loci asso- 
ciated with SPs. We then showed that OA of the 
hip and knee are associated with specific SPs 
that comprise each of those joints. Lastly, we 
performed an analysis to link SPs with regions 
of the genome that were accelerated in human 
evolution, as well as regions of the genome that 
were differentially regulated between great 
apes and humans. 

There have been concerted efforts to use the 
imaging data available from the UKB, but 
most of the work has focused on the magnetic 
resonance imaging (MRI) modality for the 
brain or heart (23, 67). Our study expands 
ongoing efforts in the DXA modality (68, 69), 
which is the key modality for diagnosing mus- 
culoskeletal diseases. We also extend image 
analysis beyond joint-specific DXA images to 
full-body images, which have not been examined 
in the context of bone diseases. We demon- 
strate that deep learning is useful not just in 
phenotyping individuals but also as a tool for 
QC at scale, including the capture of hetero- 
geneous types of error modes. Automated QC 
pipelines have been developed for brain and 
heart MRIs from the UKB, but fewer efforts have 
been made with DXA images (70, 71). We show 
that modification of existing deep-learning ar- 
chitectures enables us to classify DXA images 
by body part and filter full-body images for 
quality, and we have made these modified archi- 
tectures available for use on any DXA dataset. 
Our work also demonstrates the importance 
of having an interconnected dataset of imaging 
data and physical measurements to best lever- 
age biological insights; the scaling and resolu- 
tion issues presented by the imaging data would 
have been impossible to correct for without 
information about individual height in the bio- 
bank metadata. Through transfer learning, we 
also show that deep learning-based landmark 
estimation can produce accurate and replica- 
ble phenotypes for imaging data with limited 
manual annotation. We present the final
DXA-trained models, which are fast, flexible
architectures that can be deployed rapidly at
population scale, enabling their utility for automated
phenotyping as imaging data becomes more 
integrated into large population biobanks. 

Beyond methodological improvements for 
biobank-scale analysis, our results provide new 
insights into musculoskeletal biology. Despite 
more than a century of work in genetics in- 
vestigating the development of limbs and the 
overall body plan, a comprehensive genetic map 
of variation that shapes the overall skeletal form 
has been absent. Specifically, which genes regulate
the modular development of the forelimb, hindlimb,
and other long bones, and how their expression does
so, have not been fully characterized.
Additionally, whether natural selection has 
acted on these genes to alter the development 
of limb proportions, thus allowing us to walk 
upright, is still unknown. Our work provides 
a genotype-to-phenotype map of SPs and lays 
the foundation for future assays of the genes 
discovered to understand how they contrib- 
ute functionally to overall phenotype. 

The moderate genetic correlations (a maxi- 
mum of 0.55) observed between SPs indicate 
genetic sharing, particularly among limb-length 
traits, while also highlighting the distinctive 
biology behind the growth of each element. 
Our results are in line with artificial selection 
experiments in mouse lines that show that 
selection for tibia length increased the trait by 
more than 15% across 14 generations but did 
not result in significant change in overall body 
mass (72), a trait that is highly correlated with 
body width (r_g = 0.25, p = 1 x 10”) but not limb
length (r_g = -0.01, p = 0.53) proportions. Thus,
our genetic correlation and factor analysis mod- 
els provide insight into constraints placed 
on the evolutionary trajectory of the skeletal 
form both in humans and in vertebrates more 
broadly. 

One important issue that affects the inter- 
pretation of our results is the normalization 
for height for each skeletal length measure that 
we obtained. We did this to look at our primary 
outcome of interest: SPs that are independent 
of height. Several papers have cautioned that 
the interpretation of association studies per- 
formed with adjustment should be carefully 
considered (73, 74). Although this issue affects 
virtually every GWAS that uses age as a co- 
variate in the model (where age is a proxy for 
survivability, a complex trait with a heritable 
basis), our analysis is most similar to GWASs 
conducted for BMI, a trait for which body 
weight is computed as a proportion of height. 
Our results largely show consistent direction 
of effect for loci before and after height adjust- 
ment (30). This suggests that our GWASs for 
SPs are largely identifying loci that are directly 
associated with overall length of particular skel- 
etal elements and is confirmed by low genetic 
correlation between our proportion pheno-
types and height (mean r_g = 0.19) (table S28).
However, a minority of these signals could still
arise from pleiotropic increases or decreases
in other skeletal elements that affect overall 
height. Thus, in interpreting our results, it is 
important to only view each of our phenotypes 
as proportions of height rather than directly 
associated with individual skeletal element 
lengths themselves. 

Epidemiological studies indicate that OA of 
the hip and the knee frequently do not occur 
together or in combination with OA in other 
large joints, suggesting that local factors are 
important in OA pathogenesis (75-80). Speci- 
fic abnormalities in skeletal morphology are 
now recognized as major biomechanical risk 
factors for the development of OA (81-86). 
The findings presented here of the association 
between specific SPs, but not overall height, 
and joint-specific OA highlight the biomechanical 
role that these proportions play in shaping 
stresses on the joints themselves and highlight 
specific risk factors of clinical relevance. 

Across both types of evolutionary analyses, 
the most significant SP traits were those asso- 
ciated with the proportions of arms and legs, 
as well as proportions of hip width. These results 
are concordant with some of the most notable 
morphological differences between humans 
and the great apes, including arm-to-leg ratio as 
well as pelvic shape, which enabled a transition 
from knuckle-based walking to bipedalism (Fig. 
5D). Numerous studies have proposed a thermo- 
regulatory hypothesis that accompanied the 
primary biomechanical energy efficiency hypo- 
thesis to explain the evolution of these traits in 
early hominin evolution as well as to explain 
differences in anatomy between humans and 
Neanderthals (87, 88). However, only one ex- 
tremely small sample study of 20 individuals 
has been conducted to attempt to test these 
thermoregulatory theories (89). In this work, we 
conducted a large-sample-size genetic correla-
tion analysis between SPs and basal metabolic
rate as well as whole-body fat-free mass in
humans (30). We found
that an increased Arms:Legs ratio was associated 
with lower basal metabolic rate and lower 
whole-body fat-free mass (p = 9.37 x 107"; p = 
4.05 x 1077), in line with the theory that these 
changes in early human evolution would have 
also increased heat dissipation in early homi- 
nins (table S28). Our results provide genomic 
evidence of selection shaping some of the most 
fundamental anatomical transitions that have 
been observed in the fossil record in human 
evolution—changes in the overall skeletal form 
that confer the distinctive ability of humans to 
walk upright. 


Materials and methods summary 


All patient data, including electronic health 
record data, DXA images, and genotype data, 
were obtained from the UKB (37). To perform 
QC and phenotyping on 31,221 full-body DXA 
images from the UKB, we modified existing 
deep-learning models (37, 35) used for classi-
fication and landmark estimation by adding 
final additional training layers with limited 
manual annotation. We used classification mod- 
els to filter images that were poor in quality 
or incorrectly cropped, and we used the land- 
mark estimation model to extract 23 different 
IDPs that include all long-bone lengths as well 
as hip and shoulder width, which we analyzed 
while controlling for height. 

After filtering UKB participants and genotype 
data for QC, we ran GWASs using BOLT-LMM 
(90) for each phenotype and estimated the 
heritability and genetic correlations of these 
traits with each other using GCTA (91). To fur-
ther investigate the joint genetic architecture of
skeletal traits, we used genomic SEM to analyze 
the genetic factor structure of the limb and body 
measurements independent of height. Moving 
forward, we focused our remaining analyses 
on limb and body measurements as ratios of 
height (30). 

We used GCTA-COJO (92) followed by link- 
age disequilibrium-based SNP pruning in PLINK 
(93) to find independent loci across our SP 
phenotypes, which were mapped to genes using 
positional-based mapping in PLINK. We used 
MAGMA (94) to run a gene set enrichment 
analysis on our traits and queried the Human- 
Mouse Disease Connection (46) database to 
determine which mouse phenotypes and human 
diseases were associated with SP loci. 

We then examined correlations of SP phe- 
notypes with musculoskeletal disease through 
phenotypic and polygenic risk score analyses. 
First, for phenotypic analysis, we regressed the 
binary outcome of disease or reported pain in 
the hip, knee, and back against SPs while con- 
trolling for clinically relevant covariates that 
are known to affect OA (95), including age, 
sex, BMI, and other factors. For polygenic risk 
score analysis, we generated polygenic risk 
scores for each SP with Bayesian regression 
and continuous shrinkage priors (51) using the
significantly associated SNPs. We ran a logistic 
or linear regression of the polygenic risk score 
on traits across all individuals, adjusting for 
the first 20 principal components of ancestry 
and imputed sex. 

Evolutionary analyses were carried out on 
our SPs using two major methods. We used 
S-LDSC (62) to estimate the heritability enrich- 
ment for each SP in genomic annotations 
marking different evolutionary periods (30). 
We also scanned for elevated levels of inter- 
sections between genes containing genome- 
wide significant SNPs and HARs (54) through 
a modified version of the method outlined in 
Xu et al. (53) and Richard et al. (52). Addi- 
tional methodological details are available in (30). 
INTRODUCTION: Antimicrobial peptides (AMPs)
are host-encoded immune effectors first char- 
acterized for their role in fighting infection. 
AMPs are also important in determining the 
composition of the host microbiome in both 
plants and animals. Although many studies 
have shown rapid evolution of AMPs, little is 
known about the selective pressures driving 
that evolution. 


RATIONALE: The host microbiome should exert 
a substantial selective pressure on host im- 
mune molecules because the host must main- 
tain a delicate balance with its microbial 
associates. Variation in a single AMP can upset 
this balance, as suggested by recent investiga- 
tions across diverse taxa. In Drosophila, pre- 
vious studies have shown the AMP family 
Diptericin (Dpt) evolves rapidly, including a 
major effect of the amino acid polymorphism 
S69R of DptA on host defense against the op- 
portunistic pathogen Providencia rettgeri, and 
Providencia spp. are commonly found in fly 


microbiome communities. Beneficial bacteria 
of the host microbiome also grow out of con- 
trol in flies lacking multiple AMP gene fam- 
ilies, particularly the gut mutualist Acetobacter. 
Drosophila species encode two Diptericin 
genes, DptA and DptB, which are the pro- 
duct of an ancestral duplication stemming 
from a DptB-like gene. To test the idea that 
the host immune repertoire might be spe- 
cifically evolved for controlling common micro- 
biome bacteria, we screened recently made 
Drosophila AMP mutants for defense against 
infection by Acetobacter spp. to determine 
whether any of the AMP genes could explain 
how flies keep this mutualistic microbe under 
control. 


RESULTS: We found that a single AMP gene, 
DptB, explains the host ability to resist infection 
by multiple Acetobacter species. This interaction 
is highly specific: We confirmed that DptA does 
not contribute to defense against Acetobacter, 
whereas DptB does not contribute to defense 


Fruit fly experiments demonstrate that the host immune system is uniquely adapted to common 
environmental microbes. Evolutionary selection can tailor host antimicrobial peptides (chains) to control 
specific microbiome bacteria. As a defense system common across plants and animals, variations in the 
repertoire of antimicrobial peptides are likely important as key risk factors for preventing infection by 
common ecological microbes. [Credit: Diego Galagovsky] 
against P. rettgeri. We therefore determined
the evolutionary history of the Diptericin family
and performed a systematic review of micro- 
biome literature of Drosophila and other Dip- 
tera. We realized that there have been at least 
two events of convergent evolution toward DptB- 
like genes in flies feeding on fruit, an ecology 
associated with high levels of Acetobacter. These 
observations suggest that DptB evolved to con- 
trol Acetobacter in the fruit-feeding Drosophila 
ancestor. Moreover, flies that secondarily adopted 
a mushroom-feeding ecology have repeatedly 
lost their DptB genes, alongside an absence of 
Acetobacter in mushroom-breeding sites. A sim- 
ilar pattern of evolution is also seen in flies that 
have developed a plant-parasitic ecology, which 
have lost both DptA and DptB genes and have 
an ecology lacking both Providencia and Aceto- 
bacter. To investigate whether these AMP- 
microbe specificities are shared throughout 
Drosophila, we infected species from across the
phylogeny with a diverse complement of DptA- 
and DptB-like genes and alleles. We included 
species with a diversity of DptA-like genes, and 
both Drosophila melanogaster and mushroom- 
feeding flies with or without DptB. Host resis- 
tance to infection by P. rettgeri and Acetobacter 
was readily predicted using just DptA or DptB 
presence and polymorphism status, even across 
fly species separated by about 50 million years 
of evolution. 


CONCLUSION: Our study shows how two 
microbe-specific defenses evolved due to an
ancestral duplication producing two Diptericin 
genes. We describe a one-sided evolutionary 
dynamic wherein the host has adapted its 
immune repertoire to environmental microbes 
rather than coevolution of host and microbe. 
This finding helps to explain the evolutionary 
logic behind the bursts of rapid evolution 
common in AMP gene families across taxa. 
Our results also reveal why certain AMPs can 
have such disproportionate roles in defense 
against specific microbes: They were evolu- 
tionarily selected for that purpose. This real- 
ization suggests that the genome can encode 
“vestigial” immune effectors, AMPs evolved 
for defense against microbes that are no 
longer relevant to the host’s modern ecology. 
Thus, derivation and loss of microbe-specific 
effectors offers the immune system a highly 
effective mechanism for tailoring host de- 
fenses for control of ecologically relevant 
microbes. 


The list of author affiliations is available in the full article online. 
*Corresponding author. Email: m.hanson@exeter.ac.uk (M.A.H.); 
bruno.lemaitre@epfl.ch (B. L.) 
Antimicrobial peptides are host-encoded immune effectors that combat pathogens and shape the
microbiome in plants and animals. However, little is known about how the host antimicrobial peptide 
repertoire is adapted to its microbiome. Here, we characterized the function and evolution of the 
Diptericin antimicrobial peptide family of Diptera. Using mutations affecting the two Diptericins (Dpt) 
of Drosophila melanogaster, we reveal the specific role of DptA for the pathogen Providencia rettgeri 
and DptB for the gut mutualist Acetobacter. The presence of DptA- or DptB-like genes across Diptera 
correlates with the presence of Providencia and Acetobacter in their environment. Moreover, DptA- 
and DptB-like sequences predict host resistance against infection by these bacteria across the genus 
Drosophila. Our study explains the evolutionary logic behind the bursts of rapid evolution of an 
antimicrobial peptide family and reveals how the host immune repertoire adapts to changing
microbial environments.


Animals live in the presence of a complex
network of microorganisms known as
the microbiome. The relationship between
host and microbe can vary from mutualist
to pathogen, which is often context de-
pendent (1). To ensure presence of beneficial
microbes and prevent infection by patho- 
gens, animals produce many innate immune 
effectors as a frontline defense. Chief among 
these effectors are antimicrobial peptides 
(AMPs), small, cationic, host defense peptides 
that combat invading microbes in plants and 
animals (2-5). Although many studies have 
shown important roles for AMPs in regulating 
the microbiome [reviewed in Bosch and Zasloff 
(6)], presently, we cannot determine why ani- 
mals have the particular repertoire of AMPs 
that their genome encodes. 

Innate immunity has been characterized ex- 
tensively in Drosophila fruit flies (7, 8). Anti- 
microbial peptide responses are particularly 
well characterized in this insect (2, 9, 10). In 
Drosophila, AMP genes are transcriptionally 
regulated by the Toll and Imd nuclear factor-κB
(NF-κB) signaling pathways (8). Recent work
has shown that individual effectors can play 
prominent roles in the defense against spe-
cific pathogens (11-19). Consistent with this,
population genetics studies have highlighted 
genetic variants in AMPs correlated with sus- 
ceptibility against specific pathogens. A landmark 
study in Drosophila found that a serine-arginine 
polymorphism at residue 69 in one of the two
fruit fly Diptericins, “S69R” of DptA (Fig. 1A), 
is associated with increased susceptibility to 
Providencia rettgeri bacterial infection (20). A 
loss-of-function study later showed that flies 
lacking both Diptericin genes (“Dpt**",” flies 
lacking DptA and DptB) are as susceptible to 
P. rettgeri infection as Imd pathway mutants, 
whereas flies collectively lacking five other 
AMP families nevertheless resist infection in 
a manner similar to the wild type (21). Like these
investigations in Drosophila, a G49E poly- 
morphism in the AMP Calprotectin of Persian 
domestic cats is associated with susceptibility 
to severe ringworm fungal skin disease (22). 
Similar AMP variation is common across ani- 
mals (23-26). However, although P. rettgeri is 
an opportunistic pathogen of wild flies and 
ringworm is common in certain cat breeds, 
whether these AMPs are evolving to selection 
imposed by these microbes is unclear. Given 
recent studies on AMP roles beyond infection 
(27-31), other fitness trade-offs could also ex- 
plain AMP evolution. 

It is now clear that antimicrobial peptides 
shape the microbiome (6), but defining if or 
how the host immune repertoire itself is shaped 
by the microbiome has been challenging. 
Here, we characterized the function and evo- 
lution of the Diptericin gene family of flies, 
revealing that these AMPs were selected to 
control ecologically relevant microbes. 


Results 

Diptericin B is specifically required for defense 
against Acetobacter bacteria 

Acetobacter bacteria are mutualists of Drosophila 
that supplement host nutrition and are common 
in wild flies (32-35). We previously showed that 
a strain of Acetobacter grows out of control in 
the gut of Relish mutant flies (RelE20) lacking
Imd pathway activity and in flies carrying de-
letions removing 14 AMP genes (ΔAMP14) (36).
Here, we identified this Acetobacter species as 
A. sicerae strain BELCH (fig. S5). Gnotobiotic 
association with A. sicerae did not cause mor-
tality, even in ΔAMP14 flies (fig. S6A). However,
pricking flies with a needle contaminated with
A. sicerae killed ΔAMP14 flies (12, 36), also
causing an abdominal bloating phenotype that 
preceded mortality (shown later). This route of 
bacterial infection is similar to what flies ex- 
perience when their cuticle is pierced by natural 
enemies [e.g., nematodes, wasps, and mites 
(37-39)]. Because ΔAMP14 flies are killed by
A. sicerae systemic infection, one or more AMPs 
are likely required to control opportunistic 
infections by this microbe. We therefore used 
flies carrying overlapping sets of AMP muta- 
tions (21), including a Diptericin mutant panel 
affecting each of the two Diptericins (Fig. 1B),
to narrow down which AMP(s) protects the fly
against A. sicerae infection.

Ultimately, deleting just DptB fully reca-
pitulates the susceptibility of ΔAMP14 flies.
Dpt™, DptB"?, and DptB”? flies suffered 100% 
mortality after infection, with survival curves 
mirroring AAMPI4 and Rel”™” flies; these DptB- 
deficient flies also presented similar levels of 
abdominal bloating (Fig. 2, A and B). Further- 
more, ubiquitous RNA interference (RNAi) 
silencing of DptB caused both mortality and 
bloating after A. sicerae pricking (fig. S6, B and 
C). Conversely, DptA^S69R, DptA^Δ822, and even
ΔAMP5 flies collectively lacking five other AMP
gene families [Drosocin, Attacin, Defensin, 
Metchnikowin, and Drosomycin (21)] resisted 
infection in a manner comparable to wild type. 
Finally, DptB mutants display increased 
A. sicerae loads, preempting mortality (Fig. 2C), 
suggesting a direct role for DptB in suppress- 
ing A. sicerae growth. 

After revealing the critical importance of 
DptB in defense against A. sicerae, we inves- 
tigated whether DptB has a broader role in 
the control of other Acetobacter species. To 
this end, we infected flies with a panel of Aceto- 
bacter species including A. aceti, A. indonesiensis, 
A. orientalis, A. tropicalis, and A. pomorum. 
Although these Acetobacter species displayed 
different levels of virulence, DptB specifically 
promoted survival and/or prevented bloating 
against all virulent Acetobacter species (figs. S7 
and S8). 

Collectively, these results indicate that DptB 
is an AMP of specific importance in defense 
against multiple Acetobacter species, revealing 
another example of high specificity between 
an innate immune effector and a microbe rele- 
vant to host ecology. Because Acetobacter are 
common in fermenting fruits (40, 41), the major
ecological niche of Drosophila, DptB might 
be especially important for flies to colonize 
this niche. 

Fig. 1. Diptericins of D. melanogaster. (A) Alignment of D. melanogaster mature DptA and DptB peptides, which are ~52% identical. The DptA^S69R site is noted (Q in DptB, and see fig. S for protein folding predictions). (B) The two Diptericin genes are located in tandem on chromosome 2R:55F with only 1130 base pairs (bp) between them. DptA^Δ822 encodes a premature stop (W40*). Strain DptB^Δ37 encodes a 37-bp deletion overlapping the DptB intron-exon boundary, causing loss of function (fig. S2). The Dpt^SK1 deficiency removes 2137 bp, deleting the coding region of both genes. DptB also encodes a secreted propeptide (PP), similar to Drosophila Attacins (figs. S1, S3, and S4). SP, signal peptide.

Fig. 2. DptB is specifically required for defense against A. sicerae. (A) Flies lacking DptB bloat after A. sicerae systemic infection. Each data point reflects the average from one replicate experiment (~20 males). (B) Sum survival curves showing that DptB is critical for defense against A. sicerae. (C) A. sicerae bacterial load increases before mortality. Each data point reflects the average of five pooled flies. n_exp, number of experiments.


Diptericin A is specifically required to defend 
against P. rettgeri 

The Gram-negative bacterium P. rettgeri was 
isolated from the hemolymph of wild-caught 
flies (20), suggesting that it is an opportunistic 
pathogen in Drosophila. Previous studies showed 
that Diptericins play a major role in surviving 
P. rettgeri infection (20, 27), including a marked 
correlation between the DptA S69R polymor- 
phism and resistance against this bacterium: 
Flies encoding arginine were more susceptible 
than flies encoding serine at this site (20). 
However, it is unknown if DptB contributes to 
defense against P. rettgeri. 

We therefore infected our panel of Dipter- 
icin mutants by pricking with P. rettgeri (Fig. 3A). 
We confirmed the DptA^S69R allele reduces survival
after P. rettgeri infection, here with a controlled
genetic background (P < 2 x 107").
DptA^Δ822 flies also paralleled mortality of
Dpt^SK1 flies lacking both Diptericin genes
(P = 0.383). Initially, we found that DptB^KO
flies showed higher susceptibility to P. rettgeri
(P = 944 x 10"), correlated with higher
bacterial load (fig. S9A). However, our isogenic 
DptB^KO flies had only ~57% induction of the
DptA gene compared with our isogenic DptA^S69
wild type at 7 hours after infection (fig. S9B).
By contrast, we observed that DptB^Δ37 flies carry
the DptA^S69 allele, have wild-type DptA expression
(fig. S2), and actually survive infection by
P. rettgeri even better than DptA^S69 (P = 5.03 x
10*; Fig. 3A). Moreover, silencing DptB by
RNAi did not significantly affect survival against 
P. rettgeri (P > 0.05; Fig. 3B). We therefore con- 
clude that DptB itself does not have a major 
effect on resistance to P. rettgeri, although a 
cis-genetic background effect found in DptB^KO
flies causes lesser induction of DptA and, ac- 
cordingly, higher susceptibility. 

Our Diptericin mutant panel shows that 
DptA plays a major role in defense against 
P. rettgeri but not A. sicerae. Conversely, DptB 
plays a major role against A. sicerae but not 
P. rettgeri. Thus, these two Diptericin genes
are highly specific effectors explaining most
of the Imd-mediated defense of D. melanogaster
against systemic infection by either
bacterium. 


The Diptericin family shows multiple bursts of 
rapid evolution across Diptera 


Given the high specificity of D. melanogaster 
Diptericins for different ecologically relevant 
microbes, we next investigated whether host 
ecology might explain Diptericin evolution. 
First, we reviewed the evolutionary history of 
Diptericins across Diptera using newly availa- 
ble genomic resources (Fig. 4). 

Fig. 3. DptA is specifically required for defense against P. rettgeri. (A) Sum survival curves of Diptericin mutants after infection with P. rettgeri. (B) Silencing DptB by RNAi (Act>DptB-IR) does not significantly affect fly survival compared with Act>OR controls (RNAi validation is shown in fig. S10).

Diptericins are found across brachyceran fly
species, indicating an ancient origin of this
antibacterial peptide (>150 million years ago)
(42, 43). The extant Drosophila DptB-like gene
was originally derived in the Drosophilidae ancestor
through rapid evolution [figs. S11 and
S12; first shown in (43, 44)]. Later, a duplication
of DptB gave rise to the DptA locus in
the Drosophilinae ancestor ~50 million years
ago [date per (45)], which began as a DptB-like
gene but then evolved rapidly after the duplication
[shown in (44); also see figs. S11 and
S12 and table S1]. Given these repeated bursts
of evolution and only ~52% similarity between 
DptA and DptB (Fig. 1A), distinct antibacterial 
activities are not necessarily surprising. In re- 
viewing Diptericin evolution, we further realized 
the DptA^S69 residue of D. melanogaster is also
present in the subgenus Drosophila through
convergent evolution: Different codons are
used by the subgenus Sophophora (e.g., AGC)
and subgenus Drosophila (e.g., TCA) to produce
DptA^S69 residues (table S1), providing further
evidence that adaptive evolution selects for
serine at this site [complementing (20) and 
(44)]. Moreover, across species, there is a high 
level of variation at this site: In addition to the 
S69R polymorphism, this site can also encode 
either glutamine (Q) or asparagine (N) in 
DptA of other Drosophila species. Q/N is also 
seen at the aligned residue of DptB across 
Drosophila species (Q56N in DptB). These four 
residues (S, R, Q, and N) are derived compared 
with the ancestral aspartic acid residue (D) 
found in most other dipterans (table S1). 
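
The codon argument above can be checked with a few lines of Python. This is only an illustrative sketch using the standard genetic code: the two example codons (AGC and TCA) come from the text, while the helper names `hamming` and `one_step_amino_acids` are introduced here for illustration and are not part of the authors' analysis.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def hamming(a: str, b: str) -> int:
    """Number of nucleotide positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def one_step_amino_acids(codon: str) -> set:
    """Amino acids (or '*' for stop) reachable by a single nucleotide change."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                reachable.add(CODON_TABLE[mutant])
    return reachable


if __name__ == "__main__":
    for codon in ("AGC", "TCA"):
        print(codon, CODON_TABLE[codon], sorted(one_step_amino_acids(codon)))
    print("differences between AGC and TCA:", hamming("AGC", "TCA"))
```

Both codons translate to serine yet differ at all three positions, which is consistent with an independent origin of the serine residue in the two subgenera. Under the standard code, arginine and asparagine are one substitution away from AGC but not from TCA.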

This analysis suggests that the extant DptB- 
like gene first evolved in the drosophilid an- 
cestor, whereas DptA emerged from a duplication 
of a DptB-like gene, followed by rapid diversification.
The DptA^S69 residue was also derived at
least twice, and this site is highly polymorphic 
across genes and species. These repeated bursts 
of evolution suggest that fly Diptericins evolved 
responding to selection in the drosophilid 
ancestor. 


Diptericin evolution correlates with microbe 
presence in host ecology 


The diversity of Drosophila ecologies, along 
with many wild-caught fly microbiome studies, 
places us in a unique position to pair each 
host’s microbial ecology with patterns in the evo- 
lution of their Diptericins, which have microbe- 
specific importance. 

Hanson et al., Science 381, eadg5725 (2023)

[Fig. 3 plot key: P. rettgeri, OD = 1.0, 25°C; x-axis, time (days); genotypes DptA^S69, DptA^S69R, DptA^Δ822, DptB^KO, DptB^Δ37, Dpt^SK1, ΔAMP14, Rel^E20.]

We performed a systematic review of the
Diptera microbiome literature (table S2).
Acetobacter bacteria are regularly found across
species feeding on rotting fruits in microbiome
studies (32, 34, 46, 47). However, Acetobacter
appear to be absent from rotting mushrooms 
(48), and are largely absent in wild-caught 
mushroom-feeding flies themselves (48, 49). 
Further, Providencia bacteria related to P. rettgeri 
are common in species feeding on both rotting 
fruits and mushrooms [(34) and table S2]. We 
observed that three drosophilid species with 
mushroom feeding ecology, D. testacea, D. 
guttifera, and Leucophenga varia, have inde- 
pendently lost their DptB genes (Fig. 4) (43). 
Thus, three independent DptB loss events 
have occurred in flies with a mushroom-feeding 
ecology specifically lacking in Acetobacter. 
There is another Drosophila sublineage with 
an ecology that lacks Acetobacter: Scaptomyza 
(Fig. 4, green branch). Scaptomyza pallida 
feeds on decaying leaf matter and mushrooms, 
whereas Scaptomyza flava and Scaptomyza 
graminum feed on living plant tissue as leaf- 
mining parasites (50). The S. flava microbiome 
shows little prevalence of either Acetobacter or 
Providencia (51). We investigated whether 
these Scaptomyza species had pseudogenized 
either of their copies of DptA (two genes, 
DptA1 and DptA2) or DptB (one gene). We
found independent premature stop codons
in DptA1 in the leaf-mining species S. flava
(Q43*) and S. graminum (G85*), but not in
the mushroom-feeding S. pallida (fig. S13).
We also analyzed the promoter regions of
these DptA genes for the presence of Relish
NF-κB transcription factor binding sites [“Rel-κB”
sites from (52); fig. S13A], confirming that
the S. pallida DptA1 promoter retains Rel-κB
sites and likely immune induction. Thus,
Scaptomyza DptA1 genes show pseudogenization
specifically in the leaf-mining species
that lack Providencia in their present-day
ecology. However, DptA1 appears functional in
S. pallida, a mushroom-feeding species likely 
exposed to Providencia through its ecology. 
Scaptomyza DptA2 genes show variable presence 
of Rel-κB sites, but no obvious loss-of-function
mutations in coding sequence, and DptA2
remains expressed in S. flava (fig. S13B).
Screening the DptB genes of Scaptomyza, we
found no obvious loss-of-function mutations in
coding sequences. However, all three Scaptomyza
species lack Rel-κB sites in their DptB
promoter regions (fig. S13A). Whether due to
plant feeding or mushroom feeding, none of
these Scaptomyza have an ecology associated
with Acetobacter. Using RNA-sequencing data 
from the S. flava midgut (53), we confirmed a 
lack of expression of both the pseudogene 
DptA1 and DptB compared with the abundant 
expression of DptA2 (fig. S13B). We conclude 
that Scaptomyza species have independently 
pseudogenized DptA and DptB genes correl- 
ated with presence or absence of Providencia 
or Acetobacter in their ecology. 
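
As a rough illustration of how a premature stop such as Q43* can be flagged in a coding sequence, the sketch below scans codons in frame and reports the first stop that occurs before the terminal codon. The function name and the toy sequences are hypothetical and are not the Scaptomyza DptA1 sequences; the gene search and annotation methods actually used are those cited in the Methods summary.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds: str):
    """Return the 1-based codon position of the first in-frame stop codon
    that occurs before the final codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    # The last complete codon is the expected terminal stop, so skip it.
    for i in range(n_codons - 1):
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i + 1
    return None


if __name__ == "__main__":
    intact = "ATGCAAGGTTAA"   # toy CDS: Met-Gln-Gly-stop
    mutant = "ATGTAAGGTTAA"   # toy CDS with CAA -> TAA at codon 2 (a "Q2*" change)
    print(first_premature_stop(intact))
    print(first_premature_stop(mutant))
```

A check of this kind only detects nonsense mutations in the annotated reading frame. Loss of promoter Rel-κB sites or loss of expression, as described for DptB in Scaptomyza, requires separate evidence.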

Finally, convergent evolution toward DptB- 
like sequence has occurred in another lineage 
of “fruit flies”: Tephritidae (43, 44) (see figs. 
S11 and S12 for protein alignment and para- 
phyly of tephritid Diptericins clustering with 
drosophilid DptB). This family of Diptera is 
distantly related to Drosophilidae (last common
ancestor ~111 million years ago). Like Drosophila, 
many tephritid lineages (e.g., Trypetinae and 
Dacinae) feed on fruits, but like Scaptomyza, 
one lineage, Tephritinae, parasitizes live plants 
(Fig. 4, purple branches). In light of the present 
study, it would seem that the tephritid spe- 
cies that feed on Acetobacter-associated fruit 
(40, 54, 55) have convergently evolved a DptB- 
like gene, including a parallel Q/N trans-species 
polymorphism at the critical Diptericin residue 
(table S1). Like Scaptomyza, plant-parasitizing 
tephritids lack both Acetobacter and Providencia 
in their microbiomes (43) and have lost their 
Diptericin genes (Fig. 4) (43). Thus, DptB-like 
genes evolved in both Tephritidae and Droso- 
philidae species associated with a fruit-feeding 
ecology in which Acetobacter is a dominant mem- 
ber of the microbiome. The fact that DptB-like 
genes are not found in species unless their an- 
cestor had a fruit-feeding ecology suggests two 
things: (i) that the Acetobacter-rich fruit-feeding 
niche was colonized before the derivation of 
DptB-like sequence and (ii) that selection 
imposed by Acetobacter resulted in the ances- 
tors of both Tephritidae and Drosophilidae 
evolving DptB-like genes to help control this 
microbe. 

Fig. 4. Diptericin evolution correlates with host ecology and presence of Acetobacter or Providencia. Diptericin presence was screened in diverse Diptera. The residue aligned to the DptA^S69R or DptB^Q56N polymorphism is shown. The DptB-like sequence evolved first in the ancestor of Drosophilidae, and the serine-coding allele in DptA evolved at least twice (fig. S11 and table S1). The relatedness of the codons used to encode the S/R/Q/N polymorphism enables their diversification in the subgenus Sophophora (summary in top left). Fruit-feeding tephritids convergently evolved a DptB-like gene (figs. S11 and S12) including a parallel Q/N polymorphism, and P. variegata encodes an independent DptB duplication, in which the two daughter genes encode either version of the Q/N polymorphism. Within Drosophilidae (bottom part), three species with mushroom-feeding ecology have lost their DptB genes: L. varia, D. testacea, and D. guttifera. In both Drosophilinae (Scaptomyza) and Tephritidae (Tephritinae), divergence to plant feeding is also correlated with loss of Diptericin genes (fig. S13). Systematic review of microbiome studies (table S2) suggests that the absence of Providencia and Acetobacter in the host ecology is correlated with DptA and DptB loss, respectively. Red [x] indicates that the gene loss was confirmed. Copy number variation is noted in table S1. Phylogenetic cladogram was drawn from consensus of multiple studies (45, 65-67).


Our phylogenetic and ecological survey 
reveals multiple parallels among the host 
immune effector repertoire, ecology, and the 
associated microbiome. This suggests that 
these dipteran species have derived DptA- or 
DptB-like genes as their evolutionary solution 
to control important bacteria found in their 
microbiome. By contrast, specific Diptericin 
genes become superfluous when their hosts 
shift to ecologies lacking Diptericin-relevant
microbes, leading to gene loss.

21 July 2023

Variation in DptA or DptB predicts host resistance
across species separated by 50 million years
of evolution

Our study indicates that among the suite of
immune genes involved in Drosophila host
defense, the AMPs DptA and DptB are critically
important against two environmentally
relevant bacteria: the opportunistic pathogen
P. rettgeri and the gut mutualist Acetobacter.
Moreover, our phylogeny-microbiome analysis
reveals substantial correlations in terms of gene
emergence, retention, and loss. If DptA and
DptB really evolved to control P. rettgeri and 
Acetobacter, then the outcomes of P. rettgeri
and A. sicerae infection across species should 
be readily predicted using just variation in 
these two Diptericins. We therefore chose 12 
Drosophila species with variation in the poly- 
morphic site in DptA and presence or absence 
of DptB, and infected them with P. rettgeri or 
A. sicerae. Experiments in D. melanogaster
suggest that DptA^S69R affects defense against
P. rettgeri, but how DptA^S69Q or DptA^S69N affects
defense against this bacterium has never
been tested. Similarly, the effect of DptB^Q56N
on defense is also untested, so we have no a priori
expectations for how these polymorphisms 
affect peptide activity. To analyze these ex- 
periments, we used a linear mixed-model 
approach (see the materials and methods), in- 
cluding D. melanogaster flies from our Diptericin 
mutant panel as experimental controls. This
helped to calibrate our model for the expected 
effect size for variants of DptA or DptB with- 
in a single species or across species. We also 
conducted these experiments at 21°C to avoid 
heat stress to some species, which reduced 
D. melanogaster mortality compared with 
25°C (fig. S14). 

Summaries of fly species mortality are shown 
in Fig. 5. As found in D. melanogaster, resistance
to P. rettgeri was associated with a DptA^S69
allele across species. Indeed, DptA^S69R found
in either D. melanogaster or D. willistoni correlates
with increased susceptibility to P. rettgeri
(t = -9.59, P < 2 x 10°). Drosophila yakuba
with DptA^S69N was also more susceptible than
its close relatives, suggesting that asparagine
(N) is an immune-poor allele against P. rettgeri
(t = -7.26, P = 4 x 107"). Further, DptA^S69Q flies
(D. suzukii and D. immigrans) had similar
survival after P. rettgeri infection compared
with DptA^S69 flies (t = +0.07, P = 0.35), suggesting
that glutamine (Q) is a competent
defense allele against P. rettgeri when coded
by DptA (Fig. 5A). Overall, ~74% of variation
in susceptibility can be attributed to variation
in DptA alone as a fixed effect (marginal
R² = 0.743).

For infections with A. sicerae, the absence 
of DptB in the mushroom-feeding species 
D. testacea and D. guttifera was correlated 
with increased susceptibility compared with 
their close relatives (t = -10.83, P < 2 x 10°"). 
Mushroom-feeding flies displayed increased 
susceptibility to A. sicerae infection that was 
independent of DptB status (t = -3.77, P = 2 x
10~*). However, even within this susceptible
lineage, DptB loss still increased mortality
to a similar extent as DptB deletion in D.
melanogaster, indicating that the contribution
of DptB to defense against A. sicerae is independent
of host genetic background (Fig.
5B). Overall, ~87% of variation in susceptibility
to A. sicerae can be explained by just DptB
absence and host ecology as fixed effects (marginal
R² = 0.868).

Fig. 5. Diptericins predict pathogen-specific survival across Drosophila. Host phylogeny, ecology, and Diptericin complement are shown. Clean injury is shown in fig. S15. (A) Susceptibility to P. rettgeri infection varies across species, with survival largely explained by the DptA allele, particularly within the subgenus Sophophora (blue-shaded species). (B) Susceptibility to infection by A. sicerae is predicted by presence or absence of DptB, although mushroom-feeding flies also had a higher susceptibility to A. sicerae infection that was independent of DptB loss. Each data point represents one replicate experiment using 20 male flies.

These survival data establish that the specific 
resistance conferred by Diptericins observed 
in D. melanogaster applies across Drosophila 
species separated by ~50 million years of evo- 
lution. We conclude that the host immune 
repertoire adapts to the presence of ecolog- 
ically relevant microbes through the evolution 
of specialized AMPs as weapons to combat 
specific microbes. 


Discussion 


Susceptibility to infection often correlates with 
host phylogeny (56, 57), although host ecology 
greatly influences microbiome community 
structure (34, 58). Early studies of immune 
evolution suggested that AMPs were mostly 
generalist peptides with redundant function, 
suggesting that AMP variation was not caused 
by adaptive evolution (2, 3). Instead, studies 
on immune adaptation have found whole 
pathway-level effects or have identified factors 
specific to a given species [e.g., host-symbiont 
coevolution (59-61)]. As a result, despite a rich
literature on immunity-microbiome interac- 
tions, the evolutionary logic explaining why 
the host genome encodes its particular im- 
mune effector repertoire has been difficult to 
approach experimentally. 

Here, we identified how ecological microbes 
promote the rapid evolution of effectors of the 
immune repertoire, tailoring them to be high- 
ly microbe specific. The two D. melanogaster 
Diptericin genes also provide a textbook example
of how gene duplication can promote
immune novelty, equipping the host with ex- 
tra copies of immune tools that can be adapted 
to specific pathogen pressures. The Drosophila 
Diptericin mechanism of action has been elusive
because of technical difficulties in peptide
purification (2, 70). However, studies using
Phormia terranovae highlight many directions 
for future research [(42, 62, 63) and see dis- 
cussion in the supplementary materials]. Fu- 
ture studies combining both fly and microbe 
genetics should be fruitful in learning how 
host and microbe factors determine speci- 
ficity. One goal of infection biology is to try 
to identify risk factors for susceptibility pres- 
ent in individuals and populations. Our study 
suggests that characterizing the function of 
single effectors, interpreted through an evolu- 
tion-microbe-ecology framework, can help to 
explain how and why variation after infection 
occurs within and between species. 

The fly Diptericin repertoire reflects the 
presence of relevant microbes in that species’ 
ecology. Conversely, loss or pseudogenization 
of Diptericins is observed when the microbes 
they target are no longer present in their en- 
vironment. In a sense, this means that some 
AMPs seen in the genomes of these animals 
are vestigial: Immune genes evolved to fight 
microbes that the extant host rarely encounters 
(e.g., DptB in D. phalerata). Indeed, flies that 
lack DptB genes are likely disadvantaged on 
Acetobacter-rich food resources, where the 
possibility of Acetobacter systemic infection
poses a constant threat. Thus, loss of this AMP 
makes recolonization of Acetobacter-rich rot- 
ting fruits a risky proposition, entrenching the 
host in its derived ecological niche. 

Although other mechanisms of defense surely 
contribute to resistance, Diptericins have 
evolved recurrently as the fly genome’s solu- 
tion to control specific bacteria. Given our find- 
ings, we propose a model of AMP-microbiome 
evolution that includes gene duplication, se- 
quence convergence, and gene loss, informed 
by the host ecology and the associated micro- 
biome (Fig. 6). In doing so, we thus explain 
one part of why various species have the par- 
ticular repertoire of AMPs that they do. This 
ecology-focused model of AMP-microbiome 
evolution provides a framework for under- 
standing how host immune systems rapidly 
adapt to the suite of microbes associated with 
a new ecological niche. These findings are 
likely of broad relevance to immune evolution 
in other animals. 


Methods summary 


Full materials and methods are found in the 
supplementary materials. In brief, D. melano- 
gaster fly stocks included both natural muta- 
tions and a transgenic insertion disrupting DptB, 
which were isogenized into the DrosDel iso- 
genic background, as indicated in Fig. 1 with 
the prefix “iso.” Nonisogenic DSPR A3 flies 
(DptB^Δ37) were from (64). Survival experiments
were performed and analyzed as described
previously (27), with the temperature and optical
density at 600 nm (OD600) of the bacteria
(“OD”) indicated within figures. Twenty male
flies were used per experiment unless otherwise
indicated, and at least three replicate
experiments were performed for all data shown
in main figures, with raw data available in the
supplement. In fly bloating, bacterial load, and
gene expression graphs, error bars indicate
SD. The cladogram and annotations in Fig. 4
were generated by literature review (table S2),
with gene search and annotation methods
per (43).

Fig. 6 (continued). microbe, P. rettgeri, including convergent evolution of the critical S69 residue. In the sublineage including D. melanogaster, codon volatility enables any of S, R, Q, or N residues. The sublineage including D. guttifera and S. flava evolved its S residue using a different S codon, evolutionarily fixing this residue (table S1). (4) In D. melanogaster, host ecology remains associated with both Acetobacter and Providencia, which continually select for maintenance of both genes. (5) In mushroom-feeding D. guttifera, Providencia remains a threat, but mushroom ecology lacks Acetobacter. Consequently, selection is relaxed on DptB, leading to pseudogenization. (6) In leaf-mining species such as S. flava, Acetobacter and Providencia are absent from the microbiome, and thus selection is relaxed on both Diptericin genes. This AMP-evolution-ecology framework makes sense of why AMPs have microbe specificity and helps to explain how shifts in microbial ecology can promote rapid evolution for AMP-microbe specificity or loss of “vestigial AMPs” that are relevant primarily against microbes that the host no longer encounters.

The script for Fig. 5 is available in the sup- 
plement. Briefly, we used a linear mixed-model 
(“lme4” and “performance” packages in R) with
species relatedness and experiment block in- 
cluded as random factors and host ecology and 
variation in DptA or DptB loci including copy 
number or alleles at key residues (D. melano- 
gaster DptA N52 or S69 alleles) as fixed fac- 
tors. When loss of function was present, the 
allele was called as “deleted.” We explored 
our model both by Akaike information crite- 
rion model selection and by iterative linear 
mixed-model testing in which nonsignificant 
fixed factors (e.g., DptB allele in explaining survival
after P. rettgeri infection) and their interactions
were relegated to being random factors
in the final model. These two approaches pro- 
vided similar results, and we used values from 
linear mixed models in the main text. 
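
To make the "marginal R²" values quoted in the Results more concrete, the following sketch computes marginal and conditional R² from the variance components of a fitted mixed model, following the commonly used Nakagawa and Schielzeth definition. Treating the reported values as this statistic is an assumption based on the mention of the “performance” package; the function name and the example variances are hypothetical and are not taken from the study's data.

```python
def nakagawa_r2(var_fixed, var_random, var_residual):
    """Marginal and conditional R^2 from mixed-model variance components.

    var_fixed    -- variance of the fixed-effect predictions
    var_random   -- iterable of random-effect variances (e.g., species, block)
    var_residual -- residual variance
    """
    var_random_total = sum(var_random)
    total = var_fixed + var_random_total + var_residual
    marginal = var_fixed / total                          # fixed effects only
    conditional = (var_fixed + var_random_total) / total  # fixed + random
    return marginal, conditional


if __name__ == "__main__":
    # Hypothetical variance components, chosen only for illustration.
    marginal, conditional = nakagawa_r2(3.0, [0.5, 0.5], 1.0)
    print(f"marginal R2 = {marginal:.3f}, conditional R2 = {conditional:.3f}")
```

In this framing, a marginal R² of 0.743 would mean that the fixed effects alone (here, the DptA allele) account for roughly 74% of the total modeled variance, with the remainder split between the random factors and residual variation.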

HEART DISEASE


Immune-mediated denervation of the pineal gland 
underlies sleep disturbance in cardiac disease 


Karin A. Ziegler, Andrea Ahles, Anne Dueck, Dena Esfandyari, Pauline Pichler,
Karolin Weber, Stefan Kotschi, Alexander Bartelt, Inga Sinicina, Matthias Graw,
Heinrich Leonhardt, Ludwig T. Weckbach, Steffen Massberg, Martina Schifferer,
Mikael Simons, Luciano Hoeher, Jie Luo, Ali Ertürk, Gabriele G. Schiattarella,
Yassine Sassi, Thomas Misgeld, Stefan Engelhardt*


Disruption of the physiologic sleep-wake cycle and low melatonin levels frequently accompany cardiac
disease, yet the underlying mechanism has remained enigmatic. Immunostaining of sympathetic axons in
optically cleared pineal glands from humans and mice with cardiac disease revealed their substantial
denervation compared with controls. Spatial, single-cell, nuclear, and bulk RNA sequencing traced
this defect back to the superior cervical ganglia (SCG), which responded to cardiac disease with
accumulation of inflammatory macrophages, fibrosis, and the selective loss of pineal gland-innervating
neurons. Depletion of macrophages in the SCG prevented disease-associated denervation of the
pineal gland and restored physiological melatonin secretion. Our data identify the mechanism by which
diurnal rhythmicity in cardiac disease is disturbed and suggest a target for therapeutic intervention.


In healthy humans, the sleep-wake cycle is
tightly controlled by the daytime-dependent
diurnal secretion of melatonin that achieves
synchrony with Earth’s 24-hour day and
night (diurnal) cycle (1-3). Melatonin synthesis
occurs in the pineal gland and is, together
with its secretion, tightly controlled by
sympathetic neurons that project from the 
superior cervical ganglia (SCG). In addition to 
pineal gland-innervating neurons (4), the SCG 
harbors heart-innervating neurons (5, 6). The 
end organ-innervating neurons in the SCG 
receive input from central sympathetic nuclei 
(7). In heart disease, low melatonin levels and 
disruptions of sleep-wake rhythmicity fre- 
quently occur (8-10). These disruptions con- 
siderably contribute to the overall disease 
burden, yet there is no consensus as to their 
treatment (11). The mechanism underlying
the altered sleep-wake cycle in cardiac disease 
has remained elusive, and the role of pineal 
gland innervation has not been addressed. In 
this work, we systematically charted the pineal 
gland-controlling neuronal circuits in cardiac 
disease. Our results indicate severe and likely 
irreversible immune-mediated destruction of
a specific subset of sympathetic neurons that
control diurnal rhythmicity of pineal melatonin. 
These data reveal that defective sympathetic 
control of the pineal gland underlies the disturb- 
ance of diurnal rhythmicity in cardiac disease. 


Cardiac disease causes sympathetic denervation 
and dysfunction of the pineal gland 


The pineal gland and its peripheral sympathetic
regulation (12) play a key role in diurnal
rhythms. Additionally, both diurnal (9) and 
sympathetic disruptions (13) are highly prev- 
alent in chronic cardiac disease. Thus, we 
hypothesized that in such pathological set- 
tings, the neuronal control of pineal gland func- 
tion might be impaired. To test this hypothesis, 
we assessed pineal gland sympathetic inner- 
vation in humans with cardiac disease com- 
pared with heart-healthy controls. We collected 
postmortem pineal glands from seven heart 
disease patients and nine heart-healthy con- 
trols, performed tissue clearing, and stained 
them for the sympathetic marker enzyme 
tyrosine hydroxylase. In pineal gland tissue 
from the patients with cardiac disease, we observed
a significant reduction of axonal density
(Fig. 1, A and B). We reasoned that this loss
of innervation of the pineal gland could provide
an explanation for the lower melatonin
levels observed in humans with cardiac disease
(14) and therefore sought to investigate
the underlying cellular mechanism in an ani- 
mal model of cardiac disease. We surveyed 
pineal gland function in mice subjected to 
transverse aortic constriction (TAC) to exert 
left ventricular pressure overload, a condition 
that leads to pathologic cardiac hypertrophy 
and failure (Fig. 1, C to F, and fig. S1, A and B) 
(15). Plasma melatonin concentrations in these 
TAC-treated mice were reduced, relative to 
controls, 4 weeks after the respective inter- 
vention (Fig. 1D). We then asked whether this 
change in pineal gland function caused al- 
terations of a comprehensive standard set of 
parameters for diurnal rhythmicity, includ- 
ing respirometry to survey metabolic rates, 
nutrient uptake, body mass, and activity (de- 
termined with infrared arrays). We found a 
marked disruption of diurnal rhythmicity with 
a reduced amplitude in TAC mice compared 
with controls. This relative loss of diurnal 
rhythmicity was not associated with changes 
in total energy expenditure or activity, which 
indicates that the shift in cycling is a specific 
effect rather than being related to unspecific 
sickness of TAC mice (Fig. 1, E and F). 
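
The dark-to-light ratio used here as a readout of diurnal amplitude can be illustrated with a minimal sketch. The function name, the 06:00 to 18:00 light phase, and the sample values are hypothetical illustrations and are not taken from the study; the authors' actual analysis pipeline is not described in this text.

```python
from statistics import mean


def dark_light_ratio(samples, lights_on=6, lights_off=18):
    """Return mean(dark-phase values) / mean(light-phase values).

    samples: iterable of (hour_of_day, value) pairs, e.g. activity in
    beam breaks per minute. The 06:00-18:00 light phase is an assumed
    default, not a parameter reported in the article.
    """
    samples = list(samples)
    light = [v for h, v in samples if lights_on <= h < lights_off]
    dark = [v for h, v in samples if not (lights_on <= h < lights_off)]
    if not light or not dark:
        raise ValueError("need samples from both the light and the dark phase")
    light_mean = mean(light)
    if light_mean == 0:
        raise ValueError("light-phase mean is zero; ratio undefined")
    return mean(dark) / light_mean


# Hypothetical recordings: nocturnal animals are more active in the dark.
rhythmic = [(0, 30), (20, 50), (9, 30), (12, 10)]
flattened = [(0, 22), (20, 24), (9, 20), (12, 20)]
print(dark_light_ratio(rhythmic))   # larger ratio: pronounced rhythm
print(dark_light_ratio(flattened))  # ratio near 1: reduced amplitude
```

A ratio that moves toward 1 while the overall mean stays similar corresponds to the pattern described above: reduced diurnal amplitude without a change in total activity or energy expenditure.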

Mice allow the genetic labeling of sympathetic neurons, which permits axon imaging with a superior signal-to-noise ratio (16). We next sought to generate mice with fluorescently labeled sympathetic axons by crossbreeding dopamine-β-hydroxylase (Dbh)-Cre mice with tdTomato mice (fig. S1C). Application of cardiac pressure overload for 4 weeks recapitulated the hallmarks of pineal gland pathology observed in humans with chronic cardiac disease, namely a significantly lower sympathetic axonal density compared with sham-operated animals (Fig. 1, G and H, and fig. S1D). We observed similar effects of pineal gland denervation (Fig. 1, I and J) in another disease model: heart failure with preserved ejection fraction (HFpEF) induced by chronic metabolic and hypertensive stress (17). Pineal gland cellular composition (fig. S1E) as well as pineal gland area (fig. S1F) remained unchanged, prompting us to analyze next the SCG from which the pineal gland-innervating axons originate.


¹Institute of Pharmacology and Toxicology, Technical University Munich (TUM), Munich, Germany. ²DZHK (German Centre for Cardiovascular Research), Partner Site Munich Heart Alliance, Munich, Germany. ³Institute for Cardiovascular Prevention (IPEK), Faculty of Medicine, Ludwig-Maximilians-Universität (LMU) München, Munich, Germany. ⁴Institute for Diabetes and Cancer, Helmholtz Center Munich, Neuherberg, Germany. ⁵Department of Molecular Metabolism & Sabri Ülker Center for Metabolic Research, Harvard T.H. Chan School of Public Health, Boston, MA, USA. ⁶Institute of Legal Medicine, Faculty of Medicine, Ludwig-Maximilians-Universität (LMU) München, Munich, Germany. ⁷Human Biology & Bioimaging, Faculty of Biology, Ludwig-Maximilians-Universität (LMU) München, Munich, Germany. ⁸Medizinische Klinik und Poliklinik I, Klinikum der Universität München, Munich, Germany. ⁹Institute of Cardiovascular Physiology and Pathophysiology, Biomedical Center, Ludwig-Maximilians-Universität (LMU), Planegg-Martinsried, Germany. ¹⁰DZNE (German Center for Neurodegenerative Diseases), Munich, Germany. ¹¹Munich Cluster for Systems Neurology (SyNergy), Munich, Germany. ¹²Institute of Neuronal Cell Biology, Technical University Munich (TUM), Munich, Germany. ¹³Institute for Tissue Engineering and Regenerative Medicine (iTERM), Helmholtz Center Munich, Neuherberg, Germany. ¹⁴Institute for Stroke and Dementia Research, Klinikum der Universität München, Ludwig-Maximilians-Universität (LMU) München, Munich, Germany. ¹⁵DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, Berlin, Germany. ¹⁶Max Rubner Center for Cardiovascular Metabolic Renal Research (MRC), Deutsches Herzzentrum der Charité (DHZC), Charité-Universitätsmedizin Berlin, Berlin, Germany. ¹⁷Translational Approaches in Heart Failure and Cardiometabolic Disease, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany. ¹⁸Fralin Biomedical Research Institute at Virginia Tech Carilion, Roanoke, VA, USA. ¹⁹Department of Biomedical Sciences and Pathobiology, Virginia-Maryland College of Veterinary Medicine, Virginia Tech, Blacksburg, VA, USA. ²⁰Department of Internal Medicine, Virginia Tech Carilion School of Medicine, Roanoke, VA, USA.

*Corresponding author. Email: stefan.engelhardt@tum.de

Ziegler et al., Science 381, 285-290 (2023)

21 July 2023

[Figure 1, panels A to J: pineal gland original images with axon tracings; study timeline (cardiac pressure overload (TAC) or sham, melatonin measurement, assessment of diurnal rhythm, harvest); Ctrl versus TAC plots, with time in hours on the x axis.]

Fig. 1. Cardiac disease causes sympathetic denervation and dysfunction of the pineal gland. (A) Representative images of human pineal glands after clearing and staining for tyrosine hydroxylase (TH). Scale bar, 1 mm. (B) Quantification of axon area and length. Axonal parameters were normalized to pineal gland area. (C) Timeline of mouse behavioral study. (D) Measurement of melatonin in plasma from C3H/HeJ mice 28 days after the respective intervention. (E and F) Assessment of diurnal rhythm in C3H/HeJ mice after sham and TAC. Mouse behavior including metabolic measurements [including oxygen consumption (VO₂) and energy expenditure] as well as activity (beam breaks per minute) was continuously recorded. The ratio between dark and light was calculated for activity and energy expenditure (kcal per hour). (G) Representative images of mouse pineal glands from dopamine β-hydroxylase (DBH)-Cre/tdTomato mice 28 days after sham and TAC. Scale bar, 200 μm. (H) Quantification of axon area and length. Axonal parameters were normalized to pineal gland area. (I) Representative images of mouse pineal glands stained for TH 15 weeks after control (chow) and HFpEF treatment. Scale bar, 200 μm. (J) Quantification of axon area and length. Axonal parameters were normalized to pineal gland area. Data are mean ± SEM of n = 7 specimens (heart healthy) and n = 9 (heart disease) for (A) and (B); n = 12 mice (sham), and n = 10 (TAC) for (D); n = 11 (sham) and n = 15 (TAC) for (E) and (F); and n = 5 (sham), n = 6 (TAC), n = 5 (chow), and n = 4 (HFpEF) for (G) to (J). Statistical analysis was performed using Student's t test. *P < 0.05, **P < 0.01, ***P < 0.001.


Chronic cardiac disease causes fibrotic scarring
and hypertrophy of SCG in mice and humans

Morphometry and histopathology of SCG derived from mice subjected to TAC revealed significant hypertrophy with significantly higher ganglionic volumes (Fig. 2, A and B). SCG from TAC mice further showed extensive fibrotic scarring when compared with that of sham-operated animals, which suggests massive, possibly irreversible organ damage (Fig. 2, A and B).

We then asked whether these findings can be translated to chronic cardiac disease in humans and prospectively obtained a total of 38 SCG specimens from 19 individuals with or without heart disease upon autopsy (table S1). As in mice with cardiac disease, SCG from patients with heart disease were significantly enlarged and exhibited increased interstitial
scarring, with the stained matrix volume amounting to ~70% of total intraganglionic volume (Fig. 2, C and D). The extent of ganglionic fibrosis significantly correlated with the extent of myocardial remodeling (fig. S2A).

Fig. 2. Chronic cardiac disease causes fibrotic scarring and hypertrophy of SCG in mice and humans. (A) (Left) Representative images of mouse superior cervical ganglia 42 days after sham and TAC surgery stained with Fast Green and Sirius Red. Scale bar, 200 μm. (Right) Magnified images of the boxed regions. Scale bar, 50 μm. (B) Quantification of SCG area and fibrosis. (C) (Top left) Paraffin sections stained with Fast Green and Sirius Red. Scale bar, 500 μm. The boxes indicate the region of the magnified images. (Top right) Magnification of the boxed regions. Scale bar, 200 μm. (Bottom) Representative images of superior cervical ganglia from individuals with and without heart disease. (D) Quantification of SCG weight and Sirius Red⁺ area in transverse SCG sections. (E to G) Evaluation of SCG morphometry in a prospective clinical study. (E) (Left) Workflow of the clinical study. (Right) Exemplary setup of an SCG ultrasound examination. (F) Representative images from human SCG acquired by ultrasound (top, healthy; bottom, heart failure). Scale bar, 3.5 cm. (G) Quantification of SCG length and ejection fraction. Data are mean ± SEM of n = 7 mice (sham) and n = 5 (TAC) for (A) and (B); n = 6 (heart healthy) and n = 7 (heart disease) for (D); n = 9 (heart healthy) and n = 10 (heart disease) for (C); n = 7 to 8 study participants (heart healthy) and n = 6 to 9 (heart failure) for (E) to (G). Statistical analysis was performed using Student's t test. *P < 0.05, **P < 0.01, ****P < 0.0001.

The degree of SCG hypertrophy led us to 
hypothesize that the size of the SCG may serve 
as an imaging biomarker for heart failure. In a 
clinical study, we therefore prospectively eval- 
uated SCG dimensions with quantitative ultra- 
sound imaging in patients with heart failure 
and healthy controls (Fig. 2, E and F, fig. S2B, 
and tables S2 and S3). Evaluation with current 
clinical ultrasound equipment revealed a sig- 
nificant increase of SCG dimensions in pa- 
tients with heart failure (Fig. 2G), with eight 
out of nine patients displaying a ~150% higher 
SCG length. We observed a significant corre- 
lation between SCG size and ejection fraction 
(fig. S2C), which is consistent with functional 
interdependence. 


Heart disease triggers macrophage infiltration 
and loss of pineal gland—innervating neurons 
in SCG 


To dissect and quantitatively assess the cellular basis for the histomorphologic alterations of the SCG, we performed single-cell and single-nuclei RNA sequencing (scRNA-seq and snRNA-seq, respectively). Ganglia were isolated from control mice and mice that had been subjected to TAC (Fig. 3A). In total, 20,780 cells passed the quality control and were used as input for computational analysis. The cellular compendium comprised five major cell types: sympathetic neurons, Schwann cells, fibroblasts, endothelial cells, and immune cells (Fig. 3A and fig. S3A). snRNA-seq furthermore identified two distinct cell clusters among the sympathetic neurons, the smaller of which selectively expressed melatonin receptor 1A (Mtnr1a). We assigned this Mtnr1a⁺ neuronal cell cluster as bona fide pineal gland-innervating neurons, because the target organs of sympathetic innervation typically secrete specific guidance cues that allow for selective axon growth during embryonic development and hence, specific innervation (8, 19) (Fig. 3, A and B, and fig. S3A). We cannot exclude, however, that further neuronal populations or subpopulations exist within the SCG, which may exert additional specific functions, and that further subpopulations besides Mtnr1a⁺ cells contribute to pineal gland innervation. All neurons share typical markers for sympathetic neurons, including tyrosine hydroxylase (Th), dopamine-β-hydroxylase (Dbh), neuropeptide Y (Npy), peripherin (Prph), and synapsin II (Syn2), in line with reports of cell inventories of other sympathetic ganglia (7, 20). Pineal gland-innervating neurons were, in addition to Mtnr1a, characterized by expression of small nucleolar RNA host gene 11 (Snhg11), Ankyrin 2 (Ank2), Hand2 opposite strand 1 (Hand2os1), Semaphorin 6D (Sema6d), EPH Receptor A5 (Epha5), neuronal cell adhesion molecule (Nrcam), synaptosome associated protein 91 (Snap91), and calcium voltage-gated channel subunit α1 A (Cacna1a) (Fig. 3B and fig. S3A). Although we found consistent expression for Cacna1a mRNA, this may not translate to functional calcium current (27). After Schwann cells, a high number of immune cells were also detected within the SCG, most of which were macrophages (5 to 8% in scSeq) (fig. S3, B to F). Spatial sequencing (MERSCOPE platform, Vizgen) of cryosections prepared from mouse SCG by using 140 RNA probes allowed transcriptome mapping with single-cell resolution (Fig. 3C and fig. S4). Automated, machine learning-based cell segmentation combined with unsupervised transcriptome-based UMAP (uniform manifold approximation and projection) clustering yielded six distinct cell clusters, including two major subclusters of sympathetic neurons (Fig. 3C and fig. S4B). One cluster expressed a set of marker genes highly similar to the bona fide pineal gland-innervating neurons identified by snSeq (Fig. 3B). The staining of the entire sympathetic nervous system in optically cleared intact adult mice allowed us to generate a three-dimensional dataset of sympathetic innervation, which permitted the uninterrupted tracing of the sympathetic innervation originating from the SCG, through the upper neck, into the cranial cavity, and all the way to the pineal gland (movie S1). These axons exclusively originated from the cranial pole of the SCG, which is in line with the cranial positioning of the respective neurons (fig. S4B).

Fig. 3. Heart disease triggers macrophage infiltration and loss of pineal gland-innervating neurons in SCG. (A) UMAP projection of 9172 cellular nuclei from SCG isolated 5 days after sham and TAC. (B) Dot plot of neuron subtype-specific genes in single-nuclei sequencing. (C) SCG cell segmentation (left) and transcriptional profile (middle) assessed with spatial RNA sequencing. (Right) Representative area in high magnification. Overlay shows the merging of cell segmentation and neuron cluster-defining marker genes. Scale bar, 1 mm (magnification, 25 μm). (D) Quantitative assessment of cellular composition in (top) mouse and (bottom) human SCG by means of genetic deconvolution of RNA sequencing libraries (percent of relative fraction). (E) CD68⁺ cells (arrows) in the respective ganglia with quantification. Scale bar, 20 μm. (F) In situ hybridization and immunofluorescence assays in mouse superior cervical ganglia with quantification. The boxes indicate the region of the close-up images. Scale bar, 20 μm (magnification, 10 μm). (G) CD68⁺ cells (arrows) in the respective ganglia with quantification. Scale bar, 20 μm. Data are from n = 4 mice (control) and n = 3 (TAC) for (D); n = 7 (sham), n = 5 [TAC_d18 (day 18)], and n = 4 [TAC_d28 (day 28)] for (E); n = 6 (control and TAC_d28) and n = 3 (TAC_d18) for (F); n = 3 specimens (heart healthy and heart disease) for (D); n = 9 (heart healthy) and n = 10 (heart disease) for (G). Student's t test was applied for (D) and (G), and one-way ANOVA with Bonferroni's post-hoc test was applied for statistical analysis of (E) and (F). *P < 0.05, **P < 0.01, ***P < 0.001. FB, fibroblasts; myoFB, myofibroblasts; MP, macrophages; EC, endothelial cells; sympathetic_pineal, pineal gland-innervating sympathetic neurons; sympathetic_other, sympathetic neurons innervating other organs; SC, Schwann cells.

We then determined a set of marker genes for the major cell populations in the SCG, which enabled the genetic deconvolution (22) of deep RNA-seq data of mouse and human ganglia
(fig. S5A). Ganglia from mice that had been 
subjected to TAC displayed a marked increase 
of the macrophage cell fraction and a signif- 
icant reduction of sympathetic pineal gland- 
innervating neurons (sympathetic_pineal) (Fig. 
3D) at day 18 after the procedure, well before 
decompensated heart failure was fully estab- 
lished (day 28) (fig. S5B). To independently 
validate the findings of changes in cellular com- 
position, we antibody-stained tissue cryosec- 
tions of SCG obtained from TAC-subjected or 
control mice for the macrophage marker CD68 
(Fig. 3E) and probed for mRNA expression of the sympathetic pineal gland-innervating neuronal cell marker gene Mtnr1a using in situ hybridization (Fig. 3F). Quantitative analysis confirmed a significant increase of CD68⁺ macrophages and a significant loss of Mtnr1a⁺ neurons (Fig. 3, E and F). We did not observe macrophage accumulation in non-cardiac-innervating ganglia (fig. S5D) or elevated levels of general inflammation markers (fig. S5E).

The loss of sympathetic pineal gland-innervating neurons was already significant 18 days after TAC and rapidly progressed toward an almost entire loss of Mtnr1a⁺ neurons (Fig. 3F). Electron microscopy that we performed 28 days after TAC corroborated signs of axonal damage in SCG neurons, which was indicated by electron-dense alterations predominantly localized at the axon initiation segment (AIS) (fig. S6). These alterations at the AIS were predominantly found in large but not small neurons, which suggests that pineal gland-innervating neurons were affected by AIS darkening (23).

Transcriptome analysis of human SCG autopsy samples from patients with established cardiac disease recapitulated a significant macrophage infiltration (albeit to a lesser extent at this late stage of cardiac disease) and the loss of sympathetic pineal gland-innervating neurons (Fig. 3D). In addition, immunofluorescent detection of intraganglionic macrophages in these samples confirmed their disease-associated accumulation (Fig. 3G).

Fig. 4. Local clodronate injection attenuates pineal gland denervation and dysfunction. (A and B) Assessment of diurnal rhythm in C3H/HeJ mice after sham and SCGx. "Untreated" refers to the time period before melatonin injection. (A) Timeline of the study. From day 35 on, mouse behavior including oxygen consumption (VO₂) and activity (beam breaks per minute) was continuously recorded. After 6 days of recording, 50 μg melatonin was once injected intraperitoneally at the time of physiological peak of endogenous melatonin production. (B) (Right) The dark-light ratio was calculated for activity. (C) Neuroimmune interaction map in superior cervical ganglia generated through intercellular ligand-receptor interaction (iTALK R package) from single-cell SCG transcriptomes. Relative differences of ligand-receptor pairs between sham and 5 days after TAC. The color indicates the acute TAC-induced changes in ligand-receptor pairings: white indicates weak, pink indicates moderate, and red indicates strong changes. (D) Timeline of clodronate experiments. (E) CD68⁺ cells (arrows) in the respective ganglia with quantification. Scale bar, 10 μm. (F) Representative images of mouse pineal glands stained for tyrosine hydroxylase (TH). Scale bar, 100 μm. (G) Quantification of axon area and length normalized to pineal gland area. (H) Measurement of melatonin in plasma from C3H/HeJ mice 28 days after the respective intervention. Data are means ± SEM of n = 6 to 7 mice for (A) and (B), n = 3 (TAC control, day 18) and n = 6 (TAC clodronate, day 18) for (E), and n = 6 to 7 (TAC control, day 28) and n = 5 (TAC clodronate, day 28) for (G) and (H). Student's t test was applied for statistical analysis of (E), (G), and (H). Two-way ANOVA with Bonferroni's post-hoc test was applied for statistical analysis of (B). *P < 0.05, **P < 0.01.


Local clodronate injection attenuates pineal 
gland denervation and dysfunction 

We next sought to assess the effects of pineal 
gland denervation on melatonin-related diur- 
nal rhythmicity. Pineal gland denervation was 
achieved by means of bilateral, surgical re- 
moval of the SCG (SCGx) and diurnal rhyth- 
micity was assessed with continuous recording 
of activity and respiration by indirect calorim- 
etry (Fig. 4, A and B). SCGx resulted in a 
marked disruption of diurnal rhythm (Fig. 4B). 
Diurnal rhythmicity, however, could be com- 
pletely restored by supplementation of mela- 
tonin (Fig. 4B). The observed infiltration of the 
SCG by macrophages and the concomitant 
loss of sympathetic pineal gland-innervating 
neurons prompted us to speculate that these
findings might be related. To this end, we
mapped intercellular communication in SCG 
by means of transcriptome profiling (24). At 
early stages of cardiac disease (5 days after 
TAC), the most pronounced alterations oc- 
curred in the communication network be- 
tween macrophages and sympathetic pineal 
gland-innervating neurons (Fig. 4C). We then 
aimed to specifically interfere with the pre- 
sumed detrimental macrophage-neuron in- 
teraction and to deplete macrophages locally 
in the SCG in TAC-treated mice (Fig. 4, D to
H). Mice were subjected to TAC, which was
followed by weekly intraganglionic injec- 
tions of the macrophage inhibitor clodronate. 
The SCG of the clodronate-injected mice ex- 
hibited significant depletion of macrophages 
(Fig. 4E). Local clodronate injection into the 
SCG prevented pineal gland denervation and 
dysfunction, as indicated by increased sym- 
pathetic axonal density within the pineal 
gland and significantly increased melatonin 
levels (Fig. 4, F to H). 

We then asked whether the detrimental 
macrophage-neuron interaction could in prin- 
ciple be recapitulated in a defined ex vivo sys- 
tem or whether other cell types are necessary 
for this interaction. Coculture of sympathetic 
neurons with proinflammatory (“M1-like”) but 
not with control macrophages inhibited neurite 
outgrowth and induced cell loss of nicotine- 
stimulated but not of quiescent sympathetic 
neurons (fig. S7, A to E). Treatment with
cobra venom factor (CVF), a broad-spectrum
complement inhibitor that blocks macrophage
activation (25), effectively prevented neu-
ronal cell loss and rescued neurite outgrowth
in this setting (fig. S7, B and C). Thus, activated
macrophages play a central role in sympathe- 
tic neuron cell death, and local macrophage 
inhibition may prove to be therapeutically 
effective. 


Discussion 


Disruption of the sleep-wake pattern and mel- 
atonin secretion is an established consequence 
of cardiac disease, yet the underlying mecha- 
nism has remained elusive. This study iden- 
tifies sympathetic denervation of the pineal 
gland as the underlying cause and suggests a 
means for therapeutic intervention. 

The fibrotic remodeling and the loss of neu- 
rons appear to be specific for the SCG with 
milder abnormalities such as neuronal hyper- 
trophy (a typical sign of their chronic over- 
activation) in the stellate ganglion and adrenal 
medulla (fig. S5C) (26). Our coculture experi- 
ments suggested that the simultaneous occur- 
rence of inflammatory macrophage polarization 
and chronic activation of sympathetic neurons 
initiated ganglionic disease. This suggests that 
the discriminating factor in the SCG is the extent 
of the inflammatory response and/or its macro-
phage-dominated nature compared with that of
other ganglia. This would also be compatible
with reports on the stellate ganglion, where 
mainly T cell infiltration (and, to a much lesser 
extent, macrophage infiltration) was observed 
in cardiac disease (27-29). 

Our findings call for further studies to eluci- 
date the mechanisms that trigger macrophage 
infiltration and their activation in the SCG. 
These further studies should include analy- 
ses of the relative roles of cardiac afferent and 
spinal preganglionic projections and of CXCL2,
IL1A, and TNFA, which are recruitment and activation fac-
tors previously reported for neuron-associated
macrophages (16).

Concerning the clinical problem of sleep 
disturbances in cardiac disease, our data call 
for the exploration of ganglion-targeted ther- 
apeutic modalities. Such an anti-inflammatory 
intervention could be local and minimally in- 
vasive and may prevent irreversible damage of 
autonomic ganglion structure and function. 
In those patients in whom such damage is 
manifested, the supplementation of melato- 
nin should be investigated in systematic clin- 
ical trials to further extend the current limited 
evidence (30). Our findings on the pronounced 
hypertrophy of the SCG as detected by ultra- 
sound call for a prospective study with serial 
follow-ups to determine whether this simple 
and robust biomarker can identify cardiac pa- 
tients who are at risk for imminent pineal de- 
nervation and are therefore candidates for 
therapeutic and preventive intervention. 

Our study suggests a paradigm in which car- 
diac disease affects an anatomically distant 
organ and that stems from the spatial integra- 
tion of organ-specific neuronal subpopulations 
in sympathetic ganglia (fig. S8). Beyond its im- 
plications for cardiac disease, the role of sym- 
pathetic ganglia as relay stations between organs 
warrants further exploration with regard to
other disease entities.


In situ photocatalytically enhanced thermogalvanic
cells for electricity and hydrogen production

Wenjing Huang, Ana Jorge Sobrido, Bingqing Wei, Xuanhua Li*


High-performance thermogalvanic cells have the potential to convert thermal energy into electricity, 
but their effectiveness is limited by the low concentration difference of redox ions. We report an 

in situ photocatalytically enhanced redox reaction that generates hydrogen and oxygen to realize a 
continuous concentration gradient of redox ions in thermogalvanic devices. A linear relation between 
thermopower and hydrogen production rate was established as an essential design principle for devices. 
The system exhibited a thermopower of 8.2 millivolts per kelvin and a solar-to-hydrogen efficiency of 
up to 0.4%. A large-area generator (112 square centimeters) consisting of 36 units yielded an open-
circuit voltage of 4.4 volts and a power of 20.1 milliwatts, as well as 0.5 millimoles of hydrogen and
0.2 millimoles of oxygen after 6 hours of outdoor operation.


Thermal energy (heat fluxes of 0° to 100°C above ambient) can come from a variety of natural and industrial processes, including solar and geothermal energy, transportation, manufacturing, electronics, and biological entities (1-4). Heat can be converted into electrical energy by using thermoelectric technologies in combination with solar illumination, but conventional thermoelectric technologies are limited by their low thermopower of microvolts per kelvin (μV K⁻¹) (5-8). Thermogalvanic and thermodiffusion cells are two alternatives that offer high thermopower of millivolts per degree (mV K⁻¹) and enable a scalable route for directly converting heat to electricity (9-11). Thermodiffusion cells based on the thermodiffusion effect of ions (ΔD) have been reported to have a considerable thermopower of 24 mV K⁻¹, but their discontinuous electrical output has made them unreliable for practical applications (12, 13). By contrast, thermogalvanic cells (TGCs) generate continuous electric power by operating under a temperature difference (ΔT), which holds promise for practical applications (14). Previous studies have reported a thermopower of 3.7 mV K⁻¹ and a normalized power density (Pmax/ΔT²) of 6.8 mW m⁻² K⁻², obtained from the heat provided by an ideal laboratory heater and cooler plates (2), whereas a thermopower of 13 mV K⁻¹ and a Pmax/ΔT² of 0.03 mW m⁻² K⁻² were obtained with solar thermal energy devices (15). We report an in situ-enhanced thermopower of 8.2 mV K⁻¹ and a Pmax/ΔT² of 8.5 mW m⁻² K⁻² in a TGC (3.14 cm²) for harvesting solar thermal energy by using a photocatalytic water splitting process with simultaneous hydrogen (H₂, 11.3 μmol hour⁻¹) and stoichiometric oxygen production (O₂, 5.5 μmol hour⁻¹).
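
The figures of merit quoted in this paragraph can be written out as a minimal sketch. The function names and the example voltage and temperature values are hypothetical; only the definitions (thermopower as voltage per kelvin, maximum power per unit area divided by ΔT squared) and the two gas production rates come from the text above.

```python
def thermopower_mv_per_k(open_circuit_voltage_mv, delta_t_k):
    """Thermopower in mV/K: open-circuit voltage divided by the temperature difference."""
    return open_circuit_voltage_mv / delta_t_k


def normalized_power_density(p_max_mw, area_m2, delta_t_k):
    """Normalized power density Pmax/(A * dT^2) in mW m^-2 K^-2."""
    return p_max_mw / area_m2 / delta_t_k ** 2


def h2_to_o2_ratio(h2_rate, o2_rate):
    """Molar H2:O2 production ratio; 2.0 is stoichiometric water splitting."""
    return h2_rate / o2_rate


# Hypothetical example: 82 mV across a 10 K difference gives 8.2 mV/K.
print(thermopower_mv_per_k(82.0, 10.0))
# Rates reported in the text (umol per hour): the ratio is close to 2:1.
print(round(h2_to_o2_ratio(11.3, 5.5), 2))
```

The ratio check is only a consistency test on the reported rates; it says nothing about how the rates were measured.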

Thermopower is associated with the solvent-dependent entropy difference (ΔS) between redox ions and the concentration difference (ΔC) of redox ions between hot and cold sides (2, 16, 17). Thermopower can be enhanced by increasing ΔS of the redox ions. For example, a polymer network bonded with ferrocyanide (FeCN⁴⁻) increased ΔS to achieve a thermopower of 1.7 mV K⁻¹ (18). Introducing the acrylic quaternary ammonium monomer into the Fe³⁺/Fe²⁺ electrolyte to adjust the redox couple's solvation shells led to a larger ΔS with an enhanced thermopower of 2.0 mV K⁻¹ (9). However, owing to the spontaneous diffusion of redox ions into a homogeneous state, the ΔC of these TGCs is thermodynamically unstable and decreases to near zero (Fig. 1A) (20-22). Guanidinium cations can selectively induce the crystallization of FeCN⁴⁻ ions and improve the ΔC between hot and cold sides while leaving the concentration of ferricyanide (FeCN³⁻) unchanged on both hot and cold sides. This approach resulted in a limited ΔC and poor thermopower of 3.7 mV K⁻¹ (2). Therefore, the construction of a high and continuous ΔC for both redox ions between the hot and cold sides and interpretation of the intrinsic ΔC modulation mechanism constitute a tremendous challenge.

We report the design of an in situ photocatalytically enhanced thermogalvanic device that can boost the thermopower to 8.2 mV K⁻¹ and provide solar-to-hydrogen (STH) efficiency of up to 0.4% (Fig. 1B). An O₂-evolution photocatalyst (OEP) aided the forward reaction from FeCN³⁻ to FeCN⁴⁻ and facilitated H₂O to O₂ production (23), resulting in a high FeCN⁴⁻ concentration on the hot side. The H₂-evolution photocatalyst (HEP) converted the FeCN⁴⁻ to FeCN³⁻ and facilitated H₂ production from H₂O (24), increasing the amount of FeCN³⁻ on the cold side. A high local concentration of FeCN⁴⁻ near the hot side thermodynamically enhanced the oxidation reaction FeCN⁴⁻ → FeCN³⁻ + e⁻ with more electrons transferred to the hot electrode, whereas a high local concentration of FeCN³⁻ near the cold side thermodynamically enhanced the reduction reaction FeCN³⁻ + e⁻ → FeCN⁴⁻ with more electrons attracted from the cold electrode, enabling a continuous reaction to produce a high voltage. As the photocatalytic reaction proceeded, a H⁺ concentration gradient was also formed within the system. Thus, the thermopower of the photocatalytically enhanced TGC was further increased by enhancing ΔC of FeCN⁴⁻, FeCN³⁻, and H⁺ along with the improved ΔS.
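
Why a larger concentration difference raises the cell voltage can be illustrated with the textbook Nernst relation for a one-electron redox couple. This is a generic sketch under the assumption of ideal activities; it is not an equation given in this text and is not the authors' model, and the function name and example concentrations are hypothetical.

```python
import math

R = 8.314462618   # molar gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def nernst_potential(e0_v, temp_k, c_ox, c_red, n=1):
    """Electrode potential E = E0 + (R*T / (n*F)) * ln(c_ox / c_red), in volts.

    c_ox and c_red are the local concentrations of the oxidized and reduced
    species (here ferricyanide and ferrocyanide), treated as ideal activities.
    """
    if c_ox <= 0 or c_red <= 0:
        raise ValueError("concentrations must be positive")
    return e0_v + (R * temp_k) / (n * F) * math.log(c_ox / c_red)


# Equal concentrations at both electrodes: no concentration contribution.
balanced = nernst_potential(0.0, 298.15, 1.0, 1.0)
# A tenfold excess of the oxidized species shifts the potential by about 59 mV.
shifted = nernst_potential(0.0, 298.15, 10.0, 1.0)
print(balanced, round(shifted, 4))
```

In this picture, opposite concentration ratios at the hot and cold electrodes add a concentration term to the voltage on top of the entropy-driven term, which matches the qualitative argument of the paragraph above.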


Cell fabrication and ΔC construction


We constructed an integrated system using a multistep polymerization method (Fig. 1C; see the Experiments section for details). Polyacrylic acid (PAA) was chosen as the matrix given its simple synthesis and low cost and was filled with water to ensure ion migration and photocatalytic reaction. We then added FeCN⁴⁻/³⁻ to the PAA precursor to serve as redox ions for the thermogalvanic reaction. WO₃ photocatalysts containing oxygen vacancies (Oᵥ-WO₃) with CoOₓ and ZnIn₂S₄ photocatalysts containing sulfur vacancies (Sᵥ-ZIS) with Pt were introduced into the upper and lower layers of the PAA precursor to serve as the OEP and HEP, respectively (Fig. 1, D and E). The introduction of vacancies in the photocatalysts can enhance the charge transport and improve the photocatalytic efficiency (fig. S1, E and F), which is consistent with our previous report (25). CoOₓ, as an oxygen production cocatalyst, was predeposited on the surfaces of Oᵥ-WO₃ by a calcination method, whereas Pt, as a hydrogen production cocatalyst, was predecorated on the surfaces of Sᵥ-ZIS by a reduction method. Cocatalysts can increase the driving force for extracting carriers and provide photocatalytic active sites to boost the photocatalytic efficiency (fig. S1, E and F) (26). Hereafter, the pristine TGC refers to the PAA-FeCN⁴⁻/³⁻ complex. After Oᵥ-WO₃ with CoOₓ and Sᵥ-ZIS with Pt were introduced into the TGC, we referred to the cell as the Oᵥ-WO₃/TGC/Sᵥ-ZIS system. Scanning electron microscopy (SEM) and transmission electron microscopy (TEM) were used to verify that the Oᵥ-WO₃/TGC/Sᵥ-ZIS system was successfully constructed (fig. S2), and Raman spectroscopy data suggested that FeCN⁴⁻/³⁻ was evenly distributed throughout the system. The depths of distribution of Oᵥ-WO₃ and Sᵥ-ZIS in the Oᵥ-WO₃/TGC/Sᵥ-ZIS system were 1 to 3 mm and 7 to 9 mm, respectively (fig. S3).
Fig. 1. In situ photocatalytically enhanced concentration gradient of redox ions in a TGC. (A and B) Schematic depiction of the thermogalvanic cell (TGC) and photocatalytically enhanced TGC. (C) Schematic illustration of the fabrication of the Oᵥ-WO₃/TGC/Sᵥ-ZIS system. FeCN⁴⁻ and FeCN³⁻ concentrations were 0.34 M and 0.26 M, respectively. (D and E) SEM images of (D) Oᵥ-WO₃ with CoOₓ and (E) Sᵥ-ZIS with Pt. (F) Infrared thermal image of the Oᵥ-WO₃/TGC/Sᵥ-ZIS under 100 mW cm⁻² light irradiation at a 30° angle. (The cold side was controlled at 298 K.) TGC denotes PAA-FeCN⁴⁻/³⁻. (G) Absorption spectrum of the Oᵥ-WO₃/TGC/Sᵥ-ZIS. (H) Cyclic voltammetry (CV) curve of the Oᵥ-WO₃/TGC/Sᵥ-ZIS. (I and J) Real-time monitoring of FeCN⁴⁻/³⁻ through in situ Raman study for the hot and cold sides of the Oᵥ-WO₃/TGC/Sᵥ-ZIS under light irradiation (100 mW cm⁻²). (K) Time courses of FeCN⁴⁻ and FeCN³⁻ concentrations on the hot and cold sides of Oᵥ-WO₃/TGC/Sᵥ-ZIS under light irradiation (100 mW cm⁻²). Error bars represent the standard deviation of 10 repeated measurements.

We investigated ΔT by placing the system in water under light irradiation. The top of the Oᵥ-WO₃/TGC/Sᵥ-ZIS system absorbed light that was converted to heat, which created a temperature difference of 16.8 K (Fig. 1F). The Oᵥ-WO₃/TGC/Sᵥ-ZIS system exhibited high light absorbance within the wavelength range of 300 to 1000 nm, which ensured that the photocatalytic process took place (Fig. 1G). Two redox peaks in the potential window of −0.28 to 0.28 V (versus Pt) were observed in the cyclic voltammetry curve and were attributed to the reduction of FeCN³⁻ to FeCN⁴⁻ and the oxidation of FeCN⁴⁻ to FeCN³⁻, respectively (12), confirming the continuous thermoelectric and photocatalytic reactions (Fig. 1H).

In situ Raman spectroscopy was used to monitor real-time changes in FeCN³⁻ and FeCN⁴⁻ concentrations for the Oᵥ-WO₃/TGC/Sᵥ-ZIS system under light irradiation (fig. S4) (27). Peaks observed at 2058 and 2091 cm⁻¹ were attributed to the A₁g and Eg modes of FeCN⁴⁻, respectively. An additional peak located at 2128 cm⁻¹ was observed, corresponding to FeCN³⁻ (Fig. 1I). Under light irradiation, the characteristic peak intensities of FeCN⁴⁻ and FeCN³⁻ on both sides of the TGC without photocatalysts remained unchanged. On the hot side of the Oᵥ-WO₃/TGC/Sᵥ-ZIS system, with the aid of photocatalysts, the peak intensities of FeCN⁴⁻ and FeCN³⁻ gradually increased and decreased, respectively, with increasing illumination time. This observation indicated that the Oᵥ-WO₃ photocatalysts converted FeCN³⁻ to FeCN⁴⁻ on the hot side. The peak intensities remained constant after 60 min, indicating that the distribution of FeCN³⁻ and FeCN⁴⁻ had reached a steady state. Similarly, on the cold side, the Sᵥ-ZIS photocatalysts caused FeCN³⁻ peak intensities to gradually increase and FeCN⁴⁻ peak intensities to gradually decrease for up to 60 min (Fig. 1J). Ultraviolet-visible (UV-vis) absorption spectroscopy was used to measure the concentration changes during the time course of illumination (Fig. 1K and fig. S5) (1). The result showed that after 60 min of illumination, the ΔC of FeCN⁴⁻ and FeCN³⁻ between hot and cold sides was 0.44 mol liter⁻¹, and pH measurements showed a ΔC of H⁺ of 3.3 × 10⁻⁷ mol liter⁻¹ between hot and cold sides of the system (fig. S6).

Fig. 2. Thermoelectric performances of the TGC and Oᵥ-WO₃/TGC/Sᵥ-ZIS. (A) Schematic diagram of the photocatalytically enhanced TGC under light irradiation. The cross-sectional area of the cell was 3.14 cm² with a radius of 1 cm. The distance between the two electrodes was 0.9 cm (upper electrode: transparent Au@Cu mesh; bottom electrode: Au@Cu foil). (B) V_oc response versus time curves of TGC and Oᵥ-WO₃/TGC/Sᵥ-ZIS for five cycles (reaction conditions: 2 ml of pure water, 100 mW cm⁻² light irradiation). (C) Thermopowers of TGC and Oᵥ-WO₃/TGC/Sᵥ-ZIS (relative contributions of ΔD+ΔS and ΔC with photocatalysts to the enhanced thermopower). (D) Current-voltage curves and corresponding power densities of TGC (ΔT = 13.8 K) and Oᵥ-WO₃/TGC/Sᵥ-ZIS (ΔT = 16.8 K). (E) Hydrogen and oxygen evolution rates of TGC and Oᵥ-WO₃/TGC/Sᵥ-ZIS (reaction conditions: 3 mg of Oᵥ-WO₃, 2.5 mg of Sᵥ-ZIS, 100 mW cm⁻² light irradiation). (F) Comparison of the thermopower and normalized power density values for various TGCs (table S2). Error bars in (C) and (E) represent the standard deviation of 10 repeated measurements.
Fig. 3. Validating the working principle of photocatalytically enhanced TGCs. (A) Schematic of the photocatalytically enhanced TGC of Oᵥ-WO₃/TGC/Sᵥ-ZIS. (B and C) Working principle of photocatalytically enhanced TGCs of Oᵥ-WO₃/TGC/Sᵥ-ZIS: (B) hot side and (C) cold side. (D) Relationship of thermopower and H₂ evolution rate in the photocatalytically enhanced TGCs. Error bars represent the standard deviation of 10 repeated measurements.


Thermoelectric performance 

We evaluated the thermoelectric performance of TGC and Oᵥ-WO₃/TGC/Sᵥ-ZIS by using a Au-coated Cu (Au@Cu) mesh with 91% transmittance as a transparent hot electrode and a Au@Cu foil as the cold electrode (Fig. 2A and fig. S7). When exposed to light irradiation (100 mW cm⁻²), the TGC and Oᵥ-WO₃/TGC/Sᵥ-ZIS showed ΔT of 13.8 and 16.8 K, respectively (fig. S8). The open-circuit voltage (V_oc) of the Oᵥ-WO₃/TGC/Sᵥ-ZIS reached 137 mV, much higher than the 37 mV of the TGC (fig. S9). The V_oc of the Oᵥ-WO₃/TGC/Sᵥ-ZIS system was 131 mV after five cycles (Fig. 2B and fig. S10). Moreover, the thermopower (i.e., ΔV/ΔT) of Oᵥ-WO₃/TGC/Sᵥ-ZIS, driven by ΔD (the thermodiffusion effect of K⁺, FeCN⁴⁻/³⁻, and H⁺), ΔS (the enhanced solvent-dependent entropy difference of FeCN³⁻/FeCN⁴⁻ and H⁺/H₂), and ΔC (the concentration difference of FeCN³⁻, FeCN⁴⁻, and H⁺ between cold and hot sides), was 8.2 mV K⁻¹, 3.0 times as high as that of TGC (2.7 mV K⁻¹), which was driven only by ΔD and ΔS contributions (78) (Fig. 2C and supplementary note S1). For Oᵥ-WO₃/TGC/Sᵥ-ZIS, only 8% of the enhanced thermopower was contributed by ΔD and ΔS versus 92% by ΔC (figs. S11 and S12 and table S1). The optimized amounts of FeCN³⁻, FeCN⁴⁻, Oᵥ-WO₃, and Sᵥ-ZIS that resulted in the largest V_oc values were 0.26 mol liter⁻¹, 0.34 mol liter⁻¹, 3 mg, and 2.5 mg, respectively (fig. S13). The thermopower and light intensity were positively correlated (fig. S14). The thermopower was also dependent on the location of the photocatalysts (fig. S15).
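
As a quick consistency check on the values reported above, the thermopower can be recomputed as the open-circuit voltage divided by the temperature difference. The sketch below is illustrative only: the function name is hypothetical, and it takes the reported V_oc (137 and 37 mV) and ΔT (16.8 and 13.8 K) at face value.

```python
def thermopower_mv_per_k(v_oc_mv: float, delta_t_k: float) -> float:
    """Return the thermopower magnitude V_oc / dT in mV per kelvin."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    return v_oc_mv / delta_t_k


# Values reported in the text (fig. S8 and fig. S9).
s_enhanced = thermopower_mv_per_k(137.0, 16.8)  # photocatalytically enhanced cell
s_pristine = thermopower_mv_per_k(37.0, 13.8)   # pristine TGC

print(f"enhanced: {s_enhanced:.1f} mV/K")
print(f"pristine: {s_pristine:.1f} mV/K")
print(f"ratio:    {s_enhanced / s_pristine:.1f}")
```

To one decimal place, these reproduce the reported 8.2 mV K⁻¹, 2.7 mV K⁻¹, and the 3.0-fold enhancement.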

The short-circuit current density, the maximum power density (P_max), and the normalized power density (P_max/ΔT²) of Oᵥ-WO₃/TGC/Sᵥ-ZIS were ~70 A m⁻², 2398 mW m⁻², and 8.5 mW m⁻² K⁻², respectively (Fig. 2D). Furthermore, the figure of merit (ZT) and Carnot-relative efficiency (η_r) for Oᵥ-WO₃/TGC/Sᵥ-ZIS were calculated to be 0.17 and 4.91%, respectively, whereas the values for TGC were much lower, at 0.02 and 0.47%, respectively. The thermal and electrical conductivities of the systems with and without catalysts were unchanged (fig. S16). The H₂ and O₂ photoproduction rates of the Oᵥ-WO₃/TGC/Sᵥ-ZIS system were 11.3 and 5.5 μmol hour⁻¹, respectively (Fig. 2E).
Fig. 4. A large-area photocatalytically enhanced TGC. (A) Schematic drawing [...] conditions: 27 mg of Oᵥ-WO₃, 22.5 mg of Sᵥ-ZIS, 18 ml of pure water, 100 mW cm⁻² light irradiation). Error bars indicate the standard deviati[...]ments. (D) Schematic drawing of a large-area photocata[...]


¹⁸O isotope-labeled photocatalytic measurements demonstrated that the detected O₂ was the product of water splitting (figs. S17 and S18).
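
The reported power figures can be cross-checked with a short calculation. The sketch below assumes a linear current-voltage curve, for which the maximum power density is V_oc·J_sc/4; that relation is a common approximation for thermogalvanic cells and is an assumption here, not something stated in the text. Function names are hypothetical.

```python
def max_power_density_mw_m2(v_oc_v: float, j_sc_a_m2: float) -> float:
    """Maximum power density for a linear I-V curve: V_oc * J_sc / 4, in mW m^-2."""
    return 1000.0 * v_oc_v * j_sc_a_m2 / 4.0


def normalized_power_density(p_max_mw_m2: float, delta_t_k: float) -> float:
    """Normalized power density P_max / dT^2, in mW m^-2 K^-2."""
    return p_max_mw_m2 / delta_t_k**2


# Reported values: V_oc = 137 mV, J_sc of about 70 A m^-2, dT = 16.8 K.
p_max = max_power_density_mw_m2(0.137, 70.0)
p_norm = normalized_power_density(2398.0, 16.8)

print(f"P_max estimate:  {p_max:.1f} mW m^-2")
print(f"P_max / dT^2:    {p_norm:.2f} mW m^-2 K^-2")
```

Under this assumption the estimate lands within 1 mW m⁻² of the reported 2398 mW m⁻², and the normalized value rounds to the reported 8.5 mW m⁻² K⁻².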

The Oᵥ-WO₃/TGC/Sᵥ-ZIS system demonstrated a photocatalytically enhanced thermopower of 8.2 mV K⁻¹, higher than that of other reported TGCs (Fig. 2F and table S2) (1, 2, 8, 9, 15, 16, 18, 19, 28-42). The normalized power density of 8.5 mW m⁻² K⁻² also exceeded that of other TGCs. The STH energy conversion efficiency was 0.4% (table S3), comparable to that of other reported photocatalysts with aqueous redox mediators (table S4).


Validating the working principle 


A working principle for the photocatalytically enhanced thermopower of the system was proposed (Fig. 3A). The band structures of the Oᵥ-WO₃ and Sᵥ-ZIS photocatalysts were determined using UV-vis diffuse reflectance spectroscopy and UV photoelectron spectra (fig. S19). Under light illumination, photogenerated electrons with sufficient energy were excited from the valence band maximum (VBM) of Oᵥ-WO₃ and Sᵥ-ZIS to the conduction band minimum (CBM) of Oᵥ-WO₃ and Sᵥ-ZIS, respectively, and holes were generated on the VBM of Oᵥ-WO₃ and Sᵥ-ZIS (Fig. 3, B and C). The electrons in the CBM of Oᵥ-WO₃ on the hot side facilitated the forward reaction from FeCN³⁻ to FeCN⁴⁻ because the CBM of Oᵥ-WO₃ was higher than the redox potential of FeCN⁴⁻/³⁻, resulting in a high concentration of FeCN⁴⁻ ions (Fig. 3B). The band alignment of Oᵥ-WO₃ and CoOₓ cocatalysts allowed holes to be efficiently extracted from the VBM of Oᵥ-WO₃ to the CoOₓ cocatalysts through the built-in electric field at the interface, which drove oxygen production (figs. S19 and S20). On the cold side, the holes in the VBM of Sᵥ-ZIS increased the amount of FeCN³⁻ ions by converting FeCN⁴⁻ to FeCN³⁻ ions (the VBM of Sᵥ-ZIS was lower than the redox potential of FeCN⁴⁻/³⁻) (Fig. 3C). The Pt cocatalysts served as an electron trap and attracted electrons from the CBM of Sᵥ-ZIS through a Schottky junction (Pt/Sᵥ-ZIS) and substantially facilitated H₂ production (figs. S19 and S20). As the O₂ and H₂ evolution reactions proceeded, H⁺ and OH⁻ were generated on the hot and cold sides of the system, respectively (Fig. 3, B and C). Because of the protonation and deprotonation processes in the PAA matrix and the H⁺ thermodiffusion, only a small ΔC of H⁺ occurred between the hot and cold sides of the system (fig. S21 and supplementary note S2).

[...] thermogalvanic device with 36 units in series (112 cm²) under the natural sunlight condition. (E) Voltage (black), power (green), solar intensity (red), and amount of evolved gases for a large-area photocatalytically enhanced thermogalvanic device with 36 units in series (112 cm²) under the natural sunlight condition from 10:00 to 16:00 (7 July 2022) at Northwestern Polytechnical University of Xi'an (reaction conditions: 108 mg of Oᵥ-WO₃, 90 mg of Sᵥ-ZIS, 72 ml of pure water). Error bars indicate the standard deviation of the three independent measurements on [...]

A thermogalvanic reaction also occurred in the system (Fig. 3A). Because FeCN⁴⁻ ions have a higher charge density and can form a more condensed hydration shell, they exhibited lower thermodynamic entropy compared with FeCN³⁻ ions (2). When a temperature gradient of 16.8 K was formed under light illumination, the high local concentration of FeCN⁴⁻ near the hot electrode thermodynamically enhanced the oxidation reaction FeCN⁴⁻ → FeCN³⁻ + e⁻ and resulted in the transfer of more electrons to the hot electrode (Fig. 3B). Similarly, the high local concentration of FeCN³⁻ near the cold electrode thermodynamically enhanced the reduction reaction FeCN³⁻ + e⁻ → FeCN⁴⁻ and attracted more electrons from the cold electrode (Fig. 3C). This continuous reaction facilitated the generation of a high voltage. The concentration gradient of H⁺ that formed within the system had an opposing effect on the increase in thermopower of FeCN⁴⁻/³⁻ (Fig. 3A). Because of its small ΔC, the contribution of H⁺ was much less than that of FeCN⁴⁻/³⁻ (figs. S11 and S12). Therefore, the high local concentration of FeCN⁴⁻ and FeCN³⁻ (i.e., extended ΔC) induced by photocatalytic reaction supported the increased thermopower of Oᵥ-WO₃/TGC/Sᵥ-ZIS. By coupling the ΔD, ΔS, and ΔC of FeCN⁴⁻ and FeCN³⁻, the high thermopower of the system was achieved.

On the basis of the above working principle and the Nernst equation relating the thermopower to the redox concentration (15), a universal theoretical relation between thermopower ($S_e$) and H₂ production rate ($r$) could be established for photocatalytically enhanced TGCs (supplementary note S3)

$$
S_e = S_{\Delta D} + S_{\Delta S}^{4-/3-} + \frac{R}{nF}\left(\ln C_0^{\mathrm{FeCN}^{4-}} - \ln C_0^{\mathrm{FeCN}^{3-}}\right) + \frac{2R}{nF\,\Delta T}\left(T_{\mathrm{hot}} + T_{\mathrm{cold}}\right)\left(\frac{1}{V_e C_0^{\mathrm{FeCN}^{4-}}} + \frac{1}{V_e C_0^{\mathrm{FeCN}^{3-}}}\right) r + S^{\mathrm{H}^+} \qquad (1)
$$

where $S_{\Delta D}$ is the thermodiffusion thermopower of mobile ions (i.e., K⁺, FeCN⁴⁻/³⁻, H⁺) and $S_{\Delta S}^{4-/3-}$ is the thermopower driven only by the ΔS of FeCN⁴⁻ and FeCN³⁻. $S^{\mathrm{H}^+}$ is the thermopower driven by the H⁺ concentration gradient. $V_e$ is the volume of the electrolyte solution of the hydrogen-evolution photocatalyst. $C_0^{\mathrm{FeCN}^{4-}}$ and $C_0^{\mathrm{FeCN}^{3-}}$ are the initial concentrations of FeCN⁴⁻ and FeCN³⁻, respectively. $F$, $n$, and $R$ are the Faraday constant, the number of electrons transferred during a redox reaction, and the ideal gas constant, respectively, and $T_{\mathrm{hot}}$ and $T_{\mathrm{cold}}$ are the temperatures of the hot and cold electrodes, respectively. These results showed a linear relation between the thermopower and H₂ evolution rate.

In our photocatalytically enhanced thermo- 
galvanic system, Eq. 2 was derived from Eq. 1 by 
substituting the corresponding experimental 
parameter values (fig. S22) 


Se = 2.7+ 0.32 (2) 


To validate the formula's universality, a series of alternative photocatalyst systems with different photocatalytic properties, including BiVO₄/TGC/ZrO₂-TaON and Cs-WO₃/TGC/SrTiO₃:Rh, were measured under the same experimental parameters in terms of thermopowers and H₂ evolution rates (fig. S23). As their H₂ production rates increased, their thermopowers increased linearly and conformed to Eq. 2 (Fig. 3D). These results demonstrated the universality of our strategy, which can serve as an essential design principle for photocatalytically enhanced thermogalvanic devices.


Large-area thermogalvanic devices 


A large prototype module (28 cm²) containing nine units connected in series was prepared that could reach a maximum V_oc of 1.2 V under 100 mW cm⁻² light irradiation (Fig. 4, A and B). After 3 hours of irradiation, H₂ and O₂ production reached 98 and 48 μmol, respectively (Fig. 4C). Outdoor experiments with the device were performed under natural sunlight. An array of the Oᵥ-WO₃/TGC/Sᵥ-ZIS modules, with an area of 112 cm², was assembled with 36 units in series that self-floated on flowing water (Fig. 4D). The natural sunlight presented a time-dependent variability in solar intensity and ambient temperature from 10:00 (the system reached a relatively stable state at this time) to 16:00 (Xi'an, 7 July 2022) (fig. S24). A V_oc value of 4.4 V and a power value of 20.1 mW were generated, indicating the practical applicability of the photocatalytically enhanced thermogalvanic technology (movie S1). After 6 hours of reaction, 0.5 mmol of H₂ and 0.2 mmol of O₂ were collected (Fig. 4E). The prototype system demonstrated a practical and sustainable way to generate electricity with simultaneous H₂ and O₂ production.
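
The module voltages quoted above can be related back to the single cell with a back-of-the-envelope calculation. The sketch assumes ideal series addition (each unit contributes equally and wiring losses are ignored), which is a simplification and not a claim made in the text; the function name is hypothetical.

```python
def per_unit_voltage_mv(total_voltage_v: float, n_units: int) -> float:
    """Average voltage per unit (mV) for n_units identical cells in series."""
    if n_units <= 0:
        raise ValueError("n_units must be positive")
    return 1000.0 * total_voltage_v / n_units


indoor = per_unit_voltage_mv(1.2, 9)    # nine-unit module, 100 mW cm^-2 lamp
outdoor = per_unit_voltage_mv(4.4, 36)  # 36-unit array, natural sunlight

print(f"indoor:  {indoor:.0f} mV per unit")
print(f"outdoor: {outdoor:.0f} mV per unit")
```

Both per-unit averages are of the same order as the 137 mV reported for a single cell; outdoors, solar intensity and ambient temperature varied over the day, so an exact match is not expected.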


Conclusion 


The photocatalytically enhanced thermogalvanic devices were demonstrated through an in situ-induced photocatalytic process that produced a continuous concentration gradient ΔC of FeCN⁴⁻ and FeCN³⁻ ions on both hot and cold sides. The system displayed a photocatalytically enhanced thermopower of 8.2 mV K⁻¹ accompanied by simultaneous solar-driven water splitting with an STH efficiency of up to 0.4%. This pioneering system combines electricity generation with H₂ and O₂ production by harnessing energy from solar radiation. This work has also demonstrated the viability of the technology at a larger scale and under natural conditions, making it a promising method for diverse environmental energy conversion using solar-thermal energy.
The precursory phase of large earthquakes


Quentin Bletery* and Jean-Mathieu Nocquet


The existence of an observable precursory phase of slip on the fault before large earthquakes has been debated for decades. Although observations preceding several large earthquakes have been proposed as possible indicators of precursory slip, these observations do not directly precede earthquakes, are not seen before most events, and are also commonly observed without being followed by earthquakes. We conducted a global search for short-term precursory slip in GPS data. We summed the displacements measured by 3026 high-rate GPS time series, projected onto the directions expected from precursory slip at the hypocenter, during the 48 hours before 90 (moment magnitude ≥7) earthquakes. Our approach reveals a ≈2-hour-long exponential acceleration of slip before the ruptures, suggesting that large earthquakes start with a precursory phase of slip, which improvements in measurement precision and density could more effectively detect and possibly monitor.


Detecting precursors to natural disasters is key for predicting those events and minimizing human and economic losses. The search for earthquake precursors has been a long-standing pursuit, with much hope being placed in the concept of earthquake prediction in the early 1970s (1). The potential for earthquake prediction was later seriously reassessed when theoretical studies suggested that earthquakes are nonlinear processes that are highly sensitive to unmeasurably fine details of the physical conditions at depth (2, 3). In the past decade, the idea has grown that large earthquakes initiate with a potentially observable slow aseismic phase of slip on the fault, associated with increased microseismicity (4-8). On the basis of either geodetic or seismic data, these studies suggest that earthquake precursors exist and that therefore earthquakes could be anticipated minutes (4), days (5, 7, 15, 18), weeks (6), months (7-12), or even years (13) before they occur.

Nevertheless, all these analyses are based on records preceding only a few earthquakes, strongly limiting the generalization of the observation. Moreover, slow aseismic slip events associated with increased microseismicity are routinely observed and most of the time do not precede a large earthquake (19-24), which further calls into question the causal relationship between these proposed precursory signals and the earthquakes. Another critical point is that these observations on natural faults do not show a continuous process culminating in the earthquake. Indeed, whether the observations come from geodetic or seismic data, they show evidence of a slow slip or a microseismic crisis that usually stops days or weeks before the catastrophic event (4-6, 8-18). None of these observations show an exponential buildup of the aseismic slip leading to the rupture, which is expected from laboratory experiments (25-28) and numerical models (29-31). One exception is a global analysis of the seismicity preceding large earthquakes, which does find an exponential increase in the number of earthquakes ranging from years up to hours preceding large events (7).


Global stack of high-rate GPS data preceding 
large earthquakes 


We investigated the existence of precursory signals in high-rate (5-min) GPS data recorded in the 48 hours preceding moment magnitude (M_w) ≥ 7.0 earthquakes worldwide (Fig. 1). We quantitatively test the hypothesis that earthquakes start with a precursory phase of slow aseismic slip at the location of the hypocenter of the forthcoming event. We calculate the expected displacements measured by GPS stations induced by such precursory slip (32). For each earthquake, $i$, for each station, $j$, and at each time step, $t$, we then calculate the dot product of the observed horizontal displacement, $\mathbf{u}_{i,j}(t)$, with the horizontal displacement, $\mathbf{g}_{i,j}$, expected from a unit precursory slip in the direction of the impending earthquake. If the observation is consistent with precursory slip (that is, if $\mathbf{u}_{i,j}(t)$ and $\mathbf{g}_{i,j}$ have similar orientations), then the dot product will be positive. If GPS data do not contain any signal related to precursory slip, the dot product, $\mathbf{u}_{i,j}(t) \cdot \mathbf{g}_{i,j}$, is equally likely to be positive or negative.

Using the global catalog of GPS data processed by the Nevada Geodetic Laboratory (33), we calculate this dot product for each earthquake and for each station and then sum their contributions at each (5-min) time step (with respect to the earthquake origin times) to obtain the stack time series

$$
S(t) = \sum_{i} \sum_{j} \frac{\mathbf{u}_{i,j}(t) \cdot \mathbf{g}_{i,j}}{\sigma_{i,j}^{2}}
$$

where the sums run over earthquakes $i$ and stations $j$, $t$ is the time relative to the earthquake origin times, and $\sigma_{i,j}$ is an estimate of the noise amplitude at each station (32). Division by the square of the noise amplitude provides a weighted stack (34-36). The dot product with the expected displacement field, $\mathbf{g}_{i,j}$, gives a greater weight to measurements at stations where larger displacement is expected from precursory slip, that is, at stations located close to the hypocenter of the upcoming earthquake. If GPS data do not contain any earthquake precursory signal, we expect S to exhibit no obvious trend. Coherent noise structures reminiscent of colored noise in GPS data are expected to be strongly attenuated by the stack on multiple earthquakes, which should not share coherent noise patterns. Consistently, the distribution of S as a function of time shows no obvious coherent pattern from 48 to 2 hours before the earthquakes (Fig. 2A). However, in the 2 hours preceding the events, the stack reveals a positive trend, supporting the hypothesis of a growing slip in the hypocenter area (Fig. 2A).
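
The stacking procedure described above (a dot product with the expected displacement, weighted by the inverse noise variance and summed over all series) can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the authors' processing code: the function name, the array layout (earthquake-station pairs flattened into one index), and the injected signal are all assumptions made for the example.

```python
import numpy as np


def weighted_stack(u, g, sigma):
    """Noise-weighted stack of dot products.

    u     : (n_series, n_times, 2) observed horizontal displacements (east, north)
    g     : (n_series, 2) displacement expected from unit precursory slip
    sigma : (n_series,) noise amplitude of each time series
    Returns S with shape (n_times,).
    """
    u = np.asarray(u, dtype=float)
    g = np.asarray(g, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dots = np.einsum("ntk,nk->nt", u, g)   # u(t) . g for every series and time
    weights = 1.0 / sigma**2               # inverse-variance weights
    return (weights[:, None] * dots).sum(axis=0)


# Synthetic demonstration: noise plus a small signal along g in the last 2 hours.
rng = np.random.default_rng(0)
n_series, n_times = 300, 576               # 576 samples = 48 h at 5-min sampling
g = rng.normal(size=(n_series, 2))
sigma = rng.uniform(1.0, 3.0, size=n_series)
u = rng.normal(size=(n_series, n_times, 2)) * sigma[:, None, None]
u[:, -24:, :] += 0.2 * g[:, None, :]       # hypothetical preslip-like signal
S = weighted_stack(u, g, sigma)

print("mean of S before the last 2 h:", S[:-24].mean())
print("mean of S in the last 2 h:    ", S[-24:].mean())
```

Because the injected signal is aligned with g in every series, it adds coherently in the stack, whereas the noise contributions have random sign and largely cancel.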


Statistical analysis of potential 
precursory signals 


To reduce the high-frequency noise level, we calculate a moving average using time windows of 1 hour and 50 min (Fig. 2B). We find that the maximum of the moving average is the last point (the average of the stack in the 1 hour and 50 min preceding the earthquakes). Its ratio to the maximum of the stack moving average in the last 2 days (excluding the latest 1 hour and 50 min) is 1.82 (a moving median gives a slightly larger ratio of 2.1). The likelihood that the last point of the moving average is the largest by chance is less than 0.2% (32). The likelihood that the last point of the moving average is twice as large as the maximum on the [−48, −2]-hour time period is much smaller. The ratio between the last point of the moving average and the standard deviation of the stack moving average in the last 2 days (excluding the latest 1 hour and 50 min) provides an estimate of the signal-to-noise ratio and is equal to 3.85 (3.9 with a moving median). Moreover, we find that the last 23 points of the moving average monotonically increase and that the last 7 points exceed the maximum in the [−48, −2]-hour period. This means that these last 7 points of the moving average are larger than all values in the 48 hours before them. We perform the analysis on 100,000 random time windows of GPS data not preceding earthquakes. The ratio for the last point of the moving average exceeds 1.82 and the last 23 points monotonically increase (the values found before the earthquakes) for only 0.03% of the drawn samples (32), providing a rough estimate of the likelihood that the signal we observe arises from noise.
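
The three statistics used above (ratio of the last moving-average point to the earlier maximum, ratio to the earlier standard deviation, and the length of the final monotonic run) can be computed as sketched below. This is an illustrative reading of the procedure, not the authors' code: the function names are hypothetical, and the baseline is taken here as all moving-average windows that do not overlap the final one, which is an assumption about how "excluding the latest 1 hour and 50 min" is implemented.

```python
import numpy as np


def moving_average(x, window):
    """Moving average; output[k] is the mean of x[k : k + window]."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")


def trailing_increasing_points(ma):
    """Number of final points that form a strictly increasing run."""
    count = 1
    for k in range(len(ma) - 1, 0, -1):
        if ma[k] > ma[k - 1]:
            count += 1
        else:
            break
    return count


def last_point_statistics(stack, window=22):
    """Compare the last moving-average point with the earlier record.

    window=22 samples corresponds to 1 hour 50 min at 5-min sampling.
    The baseline keeps only windows that do not overlap the final one.
    """
    ma = moving_average(stack, window)
    last = ma[-1]
    baseline = ma[:-window]
    return {
        "ratio_to_max": last / baseline.max(),
        "signal_to_noise": last / baseline.std(),
        "n_increasing": trailing_increasing_points(ma),
    }


# Toy stack: an early bump of amplitude 1, a quiet interval, then a final
# step of amplitude 2 lasting exactly one window.
toy = np.concatenate([np.full(22, 1.0), np.zeros(56), np.full(22, 2.0)])
print(last_point_statistics(toy))
```

Applying the same function to many randomly drawn windows of data that do not precede earthquakes, and counting how often the statistics reach the observed values, gives the kind of empirical likelihood estimate quoted in the text.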

Fig. 1. Earthquakes and GPS stations used in the study. (Top) Distribution and focal mechanisms (beachball plots) of the 90 M_w ≥ 7 earthquakes with 2 days of 5-min GPS records (with no gap and no noticeable foreshock) available within a 500-km radius of the epicenters. Mechanism sizes are indicative of event magnitudes. Colors indicate the number of time series available for each event. (Bottom) Distribution of the 3026 GPS stations with complete records in the 2 days preceding the 90 earthquakes shown above (the earthquake list is given in table S1). (Insets) Enlarged subpanels show areas of high station concentration.

S is well fitted by an exponential function of time constant τ = 1.3 hours (Fig. 2C). The misfit reduction of the fitted exponential function in the last 1 hour and 50 min is 79% (32), meaning that 79% of the signal in the last 1 hour and 50 min of S is explained by an exponential function. To facilitate interpretation, S can be converted into cumulative moment of preslip through a simple coefficient of proportionality (32). The exponential fit in moment can then be seen as a template of precursory (cumulative) moment release on the fault, suggesting that on average, earthquakes have an exponential-like precursory phase as predicted by laboratory experiments (25-28) and dynamic models (29-31). The average cumulative moment obtained before the rupture (the last point in Fig. 2C) is 3.9 × 10¹⁸ N·m, corresponding to a M_w of 6.3. Such a magnitude and duration locate precursory slip in the observation gap of fault-slip phenomena between slow aseismic slip and earthquakes (37). A more subtle, but noticeable, feature in S is a long-period oscillation. The best sinusoidal fit to S (Fig. 2D) gives a period of 12.9 hours, close to the period of tides (12.4 hours). However, the misfit reduction of the fitted sinusoidal function is only 10%, making the sinusoidal signal in S much less obvious than the exponential one.
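
To make the exponential fit and its "misfit reduction" concrete, the sketch below fits A·exp(t/τ) to a short time series by scanning τ over a grid and solving for the amplitude in closed form. It is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, and misfit reduction is taken here as 1 − ‖data − model‖² / ‖data‖², whereas the exact definition used in the study is given in its supplementary materials (32).

```python
import numpy as np


def fit_exponential(t, s, taus):
    """Least-squares fit of s(t) ~ A * exp(t / tau) over a grid of tau values.

    t is in hours and negative before the event (t = 0 at the origin time).
    Returns (amplitude, tau, sum_of_squared_residuals) for the best tau.
    """
    t = np.asarray(t, dtype=float)
    s = np.asarray(s, dtype=float)
    best = None
    for tau in taus:
        basis = np.exp(t / tau)
        amp = (basis @ s) / (basis @ basis)  # closed-form optimal amplitude
        resid = s - amp * basis
        sse = float(resid @ resid)
        if best is None or sse < best[2]:
            best = (float(amp), float(tau), sse)
    return best


def misfit_reduction(s, model):
    """Fraction of the signal energy explained by the model (assumed definition)."""
    s = np.asarray(s, dtype=float)
    model = np.asarray(model, dtype=float)
    return 1.0 - np.sum((s - model) ** 2) / np.sum(s**2)


# Synthetic check: 22 samples covering the last 1 h 50 min at 5-min sampling.
t = np.arange(-110, 0, 5) / 60.0
s = 3.0 * np.exp(t / 1.3)
amp, tau, _ = fit_exponential(t, s, np.arange(0.5, 3.01, 0.1))
print(amp, tau, misfit_reduction(s, amp * np.exp(t / tau)))
```

On noise-free synthetic data the fit recovers the input amplitude and time constant and the misfit reduction approaches 1; on real stacks it would be lower, as with the 79% quoted above.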


To test whether the observed signals are related to fault slip in the area of the forthcoming earthquakes, we replace $\mathbf{g}_{i,j}$ with unit vectors pointing to arbitrary fixed directions; for simplicity, we use the east and north directions (32). The stack we obtain shows no signal similar to what we observe in the last 2 hours of S, nor long-period oscillation (fig. S1). This rules out that the shape of S results from a spatially correlated common-mode error in GPS data and strongly supports that the source of the signals we observed in S is related to processes taking place in the direct vicinity of the hypocenter of the impending earthquakes.

Fig. 2. Global stack in the direction of expected slip. (A) Global stack S of 3026 time series recorded before 90 earthquakes as a function of time relative to each earthquake origin time. (B) A 1-hour-and-50-min moving average of S normalized by its standard deviation on the [−48-hour, −1-hour-50-min] time period, superimposed on S. The upper horizontal dashed line indicates the maximum of the moving average (excluding the last 1 hour and 50 min). The lower horizontal dashed line indicates the 0 baseline (above which observations are consistent with precursory slip). The vertical dashed line indicates the time after which the stack gives only positive values (1 hour and 55 min). The last point of the moving average is 1.82 times as large as the maximum on the [−48-hour, −1-hour-50-min] time period and 3.85 times as large as the standard deviation, providing a rough estimate of the signal-to-noise ratio. (C) Stack converted into moment (supplementary materials) with best exponential fit superimposed (time constant τ = 1.3 hours). (D) Stack in moment with best sinusoidal fit superimposed (period T = 12.9 hours).

Fig. 3. Stack in the direction of expected slip for Tohoku. (A) Stack of 355 time series recorded before the Tohoku-Oki earthquake. (B) Same as (A) converted in moment release with best sinusoidal fit superimposed (period T_TO = 3.6 hours). (C) Residual of the moment-release stack with the sinusoidal fit (blue dots) and best exponential fit (red curve, time constant τ_TO = 1.5 hours).


The case of the Tohoku-Oki earthquake 


The Tohoku-Oki earthquake (2011, M_w 9.0) is the largest event in our dataset. The event was also recorded by the largest number of stations (355 full time series) and is one of the few events for which short-term precursory activity was suggested by microseismicity analyses (5). We show the dot product stack for the Tohoku-Oki earthquake alone, S_TO, in the 24 hours preceding the event (Fig. 3). S_TO suggests precursory slip that is similar to S. It also reveals an unexpected but relatively clear sinusoidal shape. As for the global stack, we verify that when replacing $\mathbf{g}_{i,j}$ with unit vectors pointing in the east and north directions, the signal vanishes (fig. S2), strongly suggesting that this sinusoidal behavior is not related to GPS noise but rather is caused by processes taking place in the direct vicinity of the hypocenter of the Tohoku-Oki earthquake.

Bletery et al., Science 381, 297-301 (2023)

We find that the best fit for Spo is a sinu- 
soidal function of period Tyo = 3.6 hours. The 
misfit reduction for the 24-hour time series is 
72%. We try to fit sinusoidal functions to dot 
product stacks calculated at 833 randomly 
selected 48-hour-long time windows and find 
0 fit for which the misfit reduction is as high 
at periods below 12 hours (32). We also try to 
fit sinusoidal functions to dot product stacks 
calculated by changing the location of the syn- 
thetic source (32) and find 0 source locations 
that give a misfit reduction as high as for the 
location of the Tohoku earthquake (fig. S11A). 


21 July 2023 


This means that, exploring both time and
space, the most periodic signal obtained in
S_TO is found just before the event, considering
a source located at the location of the
hypocenter (32). We do not believe that sinusoidal
slip has been observed on natural faults
before, but similar phenomena have been
observed before glacier breakoff (38, 39). More
precisely, glacier precursory signals are
log-periodic, meaning that the oscillation period
decreases when getting closer to the rupture
and the amplitude increases (38, 39).
Log-periodic precursory activity also arises from
earthquake-rupture models (40, 41). As for the
global stack, we convert S_TO into moment,
which can be seen as an (integrated) precursory
source time function. The amplitude of the
fitted sinusoid is 1.0 × 10¹⁹ N·m, corresponding
to a M_w of 6.6. This large hypothetical
precursory oscillation resembles the resonance
effect predicted by rate-and-state friction laws
when the fault approaches its critical state
(42). The residual of S_TO with the sinusoid can
be fitted by an exponential of time constant
τ_TO = 1.5 hours, which is similar to the time
constant of the global stack. The associated
cumulative moment release is 2.9 × 10¹⁹ N·m,
corresponding to a M_w of 6.9.
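
The two magnitudes quoted above can be checked with the standard moment magnitude relation, M_w = (2/3)(log10 M_0 − 9.1), with the seismic moment M_0 in N·m. The text does not say which form of the relation was used, so the constant 9.1 is an assumption, and the function name is illustrative.

```python
import math

def moment_to_mw(m0_newton_metres: float) -> float:
    """Moment magnitude from seismic moment in N·m (standard relation, constant 9.1 assumed)."""
    return (2.0 / 3.0) * (math.log10(m0_newton_metres) - 9.1)

sinusoid_mw = moment_to_mw(1.0e19)     # amplitude of the fitted sinusoid
exponential_mw = moment_to_mw(2.9e19)  # cumulative moment of the exponential residual
print(round(sinusoid_mw, 1), round(exponential_mw, 1))
```

Under this assumption both values round to the magnitudes given in the text (6.6 and 6.9).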


Contributions of individual earthquakes 


Because the signal is large in S_TO and has a
potentially large weight in S, we verify that
when removing S_TO from the stack, the signal
is still present (fig. S3). We generalize the
process and evaluate the relative contribution of
each earthquake in the signal observed in the 
last 2 hours and in the overall stack (32). We 
find that the signal is not overly dominated by 
one or a few earthquakes even though we see 
larger contributions from earthquakes recorded 
by many stations and by stations located close 
to the source of the impending earthquakes 
(fig. S4). In detail, 52 earthquakes (58% of the
total) contribute positively to the global stack 
during the last 2 hours, but these 52 earth- 
quakes represent 2235 time series (74% of 
the time series) (32). Additionally, we calcu- 
late the average of the last 2 hours for the 
stacks on all earthquakes and find very simi- 
lar figures: Fifty-four earthquakes (60%) have 
a positive mean in the last 2 hours of the stack, 
but these 54 earthquakes represent 2251 time 
series (74% of the total). 


Discussion 


Because the exponential function is always 
positive and monotonically increasing, a poten- 
tial exponential acceleration of slip in the di- 
rection of the upcoming coseismic slip would 
sum constructively, making it likely to appear 
in the global stack. By contrast, the oscillation
properties of the sinusoidal function make a 
potential sinusoidal preslip unlikely to appear 
in a multiearthquake stack. Nonetheless, the 
global stack exhibits a weak sinusoidal signal. 
Even though the misfit reduction provided by 
the sinusoidal fit is only 10%, the best-fitting 
function has two interesting properties: (i) Its 
period (12.9 hours) is very close to the period 
of tides (12.4 hours), and (ii) its value at the 
earthquake origin time is close to its maximum 
(Fig. 2D). These two properties can potentially 
explain how stacked oscillations could interfere 
positively: A common excitation source could 
result in a common excited period, and an 
earthquake triggering at the most favorable 
time could explain the absence of phase lag. 
Correlations have been observed between tides 
and microseismicity (43, 44), suggesting tidal 
modulation of slow aseismic slip (45, 46).
Correlations between tides and earthquakes have
also been observed in time periods preceding
large earthquakes, suggesting that when the
faults are approaching the critical stage of
failure, tidal loading may initiate a rupture
(47, 48). A 12-hour oscillation on the faults in
the days preceding the events is therefore
physically consistent with a tidal excitation of the
system, possibly enhanced by a resonance effect
(42), as the faults reach their critical state.
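
The stacking argument of this section can be illustrated numerically. The sketch below is not the authors' processing: it stacks synthetic, noise-free signals for a hypothetical set of 90 events (the event count, amplitudes, and 5-minute sampling are assumptions), using the 1.5-hour time constant and the 12.4-hour tidal period mentioned in the text. Exponentials add constructively whatever their amplitudes, whereas sinusoids survive stacking only if they share a phase.

```python
import math
import random

def stack(signals):
    """Average a list of equally sampled signals, sample by sample."""
    n = len(signals)
    return [sum(s[k] for s in signals) / n for k in range(len(signals[0]))]

def half_range(signal):
    """Half of the peak-to-peak range, a simple amplitude measure."""
    return 0.5 * (max(signal) - min(signal))

random.seed(1)
t = [k / 12.0 for k in range(-288, 1)]   # hours before origin time, 5-min steps
n_events = 90                            # hypothetical number of events
tau, period = 1.5, 12.4                  # hours, values quoted in the text

def exponential(amplitude):
    return [amplitude * math.exp(ti / tau) for ti in t]

def sinusoid(phase):
    return [math.cos(2.0 * math.pi * ti / period + phase) for ti in t]

# Exponentials with random positive amplitudes.
exp_stack = stack([exponential(random.uniform(0.5, 1.5)) for _ in range(n_events)])
# Sinusoids with a random phase for each event.
random_phase_stack = stack([sinusoid(random.uniform(0.0, 2.0 * math.pi))
                            for _ in range(n_events)])
# Sinusoids that all peak at the origin time (common phase).
common_phase_stack = stack([sinusoid(0.0) for _ in range(n_events)])

print(exp_stack[-1], half_range(random_phase_stack), half_range(common_phase_stack))
```

With random phases the stacked amplitude shrinks roughly as the inverse square root of the number of events, which is why a common excitation period and a preferred triggering phase are needed for an oscillation to remain visible in a multiearthquake stack.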


Conclusions 


Our analysis indicates that, on average, earth- 
quakes start with a ~2-hour-long exponential- 
like acceleration of slow slip. Analysis of 
foreshock activity also suggests exponential 
acceleration of fault slip but over a much wider 
range of timescales (7). The observation we 
make on GPS time series might be the very 
end of a much longer process of precursory
slip. Although present instrumental capacities 
do not allow us to identify precursory slip at 
the scale of individual earthquakes, our obser- 
vation suggests that precursory signals exist 
and that the precision required to monitor 
them is not orders of magnitude away from
our present capabilities. 
FLUORINE CHEMISTRY


Fluorochemicals from fluorspar via a phosphate-enabled
mechanochemical process that bypasses HF
Francesco Ibba, Job Struijs, Mathias A. Ellwanger, Robert Paton, Duncan L. Browne,
Gabriele Pupo, Simon Aldridge, Michael A. Hayward, Véronique Gouverneur


All fluorochemicals, including elemental fluorine and nucleophilic, electrophilic, and radical fluorinating
reagents, are prepared from hydrogen fluoride (HF). This highly toxic and corrosive gas is produced by the
reaction of acid-grade fluorspar (>97% CaF2) with sulfuric acid under harsh conditions. The use of fluorspar to
produce fluorochemicals via a process that bypasses HF is highly desirable but remains an unsolved problem
because of the prohibitive insolubility of CaF2. Inspired by calcium phosphate biomineralization, we herein
disclose a protocol of treating acid-grade fluorspar with dipotassium hydrogen phosphate (K2HPO4) under
mechanochemical conditions. The process affords a solid composed of crystalline K3(HPO4)F and
K2−xCay(PO3F)a(PO4)b, which is found suitable for forging sulfur-fluorine and carbon-fluorine bonds.


Fluorochemicals have a wide range of
applications in the metallurgical industry,
Li-ion batteries, electronics, fluoropolymers,
refrigerants, agrochemicals,
and pharmaceuticals (1, 2). All fluorine
atoms incorporated into fluorochemicals,
including nucleophilic, electrophilic, and radical
fluorinating reagents, originate from naturally
occurring fluorspar (calcium fluoride, or CaF2).
For the production of fluorochemicals, this
mineral must be converted into hydrogen
fluoride (HF) (3, 4), a process first reported by
C. W. Scheele in 1771 (5) (Fig. 1A). Today,
industry still relies on this energy-intensive
process, which entails the reaction
of acid-grade fluorspar (acidspar, >97% CaF2)
with sulfuric acid at elevated temperatures to
generate HF; the HF is then stored as liquefied
gas or used as an aqueous solution (6). Safety
is a primary concern because HF is highly toxic
and must therefore be handled with extreme
caution. Despite stringent safety guidelines, HF
spills have occurred, some with fatal consequences
and detrimental impacts on the environment
(7). Our research ambition is to rejuvenate
fluorine chemistry with current global challenges
in mind, through the invention of safe and
sustainable fluorination methods of nonpersistent
fluorochemicals. A paradigm shift for academia
and industry would be to access essential
fluorochemicals directly from fluorspar,
thereby avoiding the production of HF and thus
decreasing energy requirements and streamlining
the current high-maintenance supply chains.
The challenge is considerable because CaF2
chemistry is viewed as inaccessible because
of its high lattice energy (ΔU_L = 2640 kJ mol⁻¹)
and prohibitive insolubility in organic solvents
(8). Herein, we disclose a solution to this
long-standing challenge and report that the
activation of acid-grade fluorspar with a potassium
phosphate salt under mechanochemical
conditions affords a fluorinating reagent for direct S-F
and C(sp2/sp3)-F bond construction (Fig. 1B).
CaF2 (melting point of ~1420°C) is a white
solid that is poorly soluble in water (0.016 g liter⁻¹
at 20°C) and insoluble in organic solvents (9).
To date, limited chemistry is known for the
production of fluorochemicals using CaF2. Rare
examples report its use in the synthesis of
LiPF6, PF5, POF3, or Ca(SO3F)2 under extremely
harsh conditions (10-12). Synthetic porous
CaF2 obtained from soda lime and HF was
reported for the conversion of α-chloro ethers
to α-fluoro ethers at 200°C (13). Soluble CaF2
complexes have also been prepared and
characterized, but there is no report on their use
for organic fluorination reactions (14, 15). As
part of our studies on alkali metal fluorides
for asymmetric fluorinations (16, 17), we expanded
our interest to CaF2 with the ultimate aim being
to use acid-grade fluorspar (>97% CaF2) as a
fluoride source for the preparation of
fluorochemicals. For direct fluorination with CaF2,
we considered the formation of a calcium
by-product with a lattice energy greater than
2640 kJ mol⁻¹ as a thermodynamic driving
force (8, 18). Calcium phosphate (bio)mineralization
is essential to the formation of bones
and teeth, and other pathological calcifications
(19), and served as inspiration for initial
investigation. Specifically, we conceived a study
probing the reactivity of CaF2 in the presence
of inorganic phosphate salts. In this scenario,
one possible calcium by-product that is formed
upon displacing fluoride from CaF2 with
phosphate ions is Ca3(PO4)2 (ΔU_L = 3534 kJ mol⁻¹
per Ca ion). Exploratory experiments combining
CaF2 with phosphate salts and various
substrates under a range of conditions gave trace
amounts of product (table S1). Attempted
optimization revealed that solution-phase chemistry
had a poor prognosis for improvement,
prompting a changeover to solid-state chemistry.
Mechanochemical ball milling was attractive
as a promising technology that enables
transformations independent of reactant solubility
and aids solid-state diffusion kinetics (20-24).
The knowledge that doping fluorite-type
compounds with monovalent cations can improve
(even if marginally) fluoride mobility in solid
electrolytes for fluoride-ion batteries encouraged
us to explore K⁺ phosphate salts to activate CaF2
in the solid state (25). Ion metathesis would then
release KF (or a derivative thereof), a commonly
used nucleophilic fluorinating reagent.
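
The thermodynamic argument can be made concrete by normalizing lattice energies per calcium ion. The sketch below uses only values quoted in the text and in Fig. 1 (2640 kJ mol⁻¹ for CaF2, 10602 kJ mol⁻¹ for Ca3(PO4)2, and 17046 kJ mol⁻¹ for hydroxyapatite). The per-Ca comparison is a crude proxy that ignores the potassium salts consumed and formed, and the names are illustrative.

```python
# Lattice energy in kJ/mol per formula unit, and number of Ca ions per formula unit.
LATTICE_ENERGY = {
    "CaF2": (2640.0, 1),
    "Ca3(PO4)2": (10602.0, 3),
    "Ca5(PO4)3(OH)": (17046.0, 5),
}

def per_calcium(formula: str) -> float:
    """Lattice energy normalized per calcium ion, in kJ/mol."""
    energy, n_ca = LATTICE_ENERGY[formula]
    return energy / n_ca

reference = per_calcium("CaF2")
for formula in LATTICE_ENERGY:
    value = per_calcium(formula)
    # Positive differences indicate a calcium by-product more stabilized than CaF2.
    print(f"{formula}: {value:.0f} kJ/mol per Ca, {value - reference:+.0f} vs CaF2")
```

For Ca3(PO4)2 the normalization reproduces the 3534 kJ mol⁻¹ per Ca ion quoted in the text.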

Initial experimentation focused on S-F bond
formation. Sulfur(VI) fluoride exchange (SuFEx)
is a powerful click reaction with applications
in chemical biology and materials science (26).
Moreover, sulfonyl fluorides are commonly used
as fluorinating reagents (27) and are more stable
than common precursor sulfonyl chlorides,
thereby offering a modest contribution to
compensate for the energetic penalty incurred upon
CaF2 dissociation. For reference, the homolytic
bond dissociation energy of the S-F bond in
SO2F2 (379 ± 18 kJ mol⁻¹) is larger than that of
the S-Cl bond in SO2Cl2 (192 ± 17 kJ mol⁻¹)
(26). Exploratory experiments were conducted
in a stainless-steel milling jar (15 ml) at 30 Hz
using one stainless-steel ball (4 g). The
reaction of 4-toluenesulfonyl chloride (TsCl)
with CaF2 (5 equiv) for 1 hour did not afford
4-toluenesulfonyl fluoride (TsF, 1); a control
experiment replacing CaF2 with KF (1.1 equiv) gave
full conversion (>95%) (table S2). Gratifyingly, the
use of CaF2 (5 equiv) in the presence of K3PO4
(2 equiv) or K2HPO4 (2 equiv) led to the
formation of TsF (1) in 7 and 17% yields [as measured
by ¹⁹F nuclear magnetic resonance (NMR)
spectroscopy], respectively. However, KH2PO4 was
ineffective (table S3). Milling CaF2 (4 equiv) and
K2HPO4 (2 equiv) for 3 hours before addition
of TsCl, and further milling of this mixture for
3 hours, gave 66% of 1 with no recovery of
starting material (table S6). The finding that partial
degradation of both TsCl and TsF took place
under these conditions led to a refined protocol
involving milling CaF2 (4 equiv) with K2HPO4
(4 equiv) and using the resulting powder for the
fluorination of TsCl in solution (tBuOH, 0.25 M;
tBu, tert-butyl) at 100°C (tables S7 to S12).

This method afforded 1 in 81% yield (¹⁹F
NMR yield). Gratifyingly, replacement of
synthetic reagent-grade CaF2 with acid-grade
fluorspar was equally effective (table S13). The
sequential milling of acid-grade fluorspar
(1 equiv) with three portions of K2HPO4 (overall
2.5 equiv) at 35 Hz gave a fluorinating reagent
(Fluoromix) of improved reactivity in the
presence of H2O (2 equiv), affording 1 isolated in
86% yield using one instead of four equivalents
[Figure 1 graphic. Panel A, industrial route to fluorochemicals (complex supply chain via toxic and corrosive HF): fluorspar (CaF2, acidspar >97% purity, ΔU_L = 2640 kJ mol⁻¹) reacts with H2SO4 (300 °C), the reaction reported by Scheele (1771), to give hydrogen fluoride, from which electrophilic reagents (F⁺; e.g., F2, Selectfluor, R2N-F) and nucleophilic reagents (F⁻; e.g., HF complexes, DAST, PyFluor, MF such as KF and CsF) lead to fluorochemicals. Panel B, fluorochemicals from acidspar bypassing HF (challenging and unsolved; this work): mechanochemical activation with a phosphate salt KxH3−xPO4 in the solid state, inspired by biomineralization (Ca3(PO4)2, calcium phosphate, ΔU_L = 10602 kJ mol⁻¹; Ca5(PO4)3(OH), hydroxyapatite, ΔU_L = 17046 kJ mol⁻¹), gives the reactive fluorinating species K3(HPO4)F and K2−xCay(PO3F)a(PO4)b for fluorination in solution: S-F and C(sp2/sp3)-F bond formation leading to deoxyfluorinating reagents, valuable building blocks, and bioactive scaffolds (over 50 fluorochemicals). Listed advantages: cheap fluoride source; operationally simple; safe to handle fluoride and phosphate salt.]


Fig. 1. Synthesis of fluorochemicals from fluorspar (CaF2). (A) Current industrial route to fluorochemicals via hydrogen fluoride. (B) Synthesis of an inorganic
fluorinating reagent upon treatment of acid-grade fluorspar with a phosphate salt under mechanochemical conditions and applications to monofluorinated chemicals
(this work). DAST, diethylaminosulfur trifluoride.



of CaF2 (figs. S1 to S4). This optimized
protocol afforded various sulfonyl fluorides of
importance in medicinal chemistry, chemical
biology, and materials science with yields up
to 98% (Fig. 2). The scope includes the
multipurpose fluorochemical ethenesulfonyl
fluoride (ESF, 19), antibiotic pharmacophore NBSF
(10), enzyme inhibitors (20 to 25) (28), and
deoxyfluorination reagents PyFluor (17) and
SulfoxFluor (18) (29, 30). We also examined the
possibility of C(sp3)-F bond formation using
Fluoromix. These reactions were best
performed in the presence of 18-crown-6. A range
of benzylic and alkyl fluorides, α-fluoroketones,
α-fluoroesters, and α-fluoroamides were
prepared in yields up to 91% (26 to 45). As a case
study for C(sp2)-F bond formation, we
selected (hetero)aryl chlorides, which underwent
fluorination in dimethyl sulfoxide (DMSO) in
modest yields (46 to 51), affording (hetero)aryl
fluorides, which are valuable building blocks
for pharmaceuticals and agrochemicals (1).
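
To translate the equivalents quoted above into weighable amounts, a short conversion is sketched below. The molar masses are standard values; the 1-mmol scale is a hypothetical example rather than the authors' scale, and the >97% purity of acid-grade fluorspar is ignored.

```python
# Molar masses in g/mol, from standard atomic weights.
MOLAR_MASS = {"CaF2": 78.07, "K2HPO4": 174.18}

def mass_mg(compound: str, equiv: float, scale_mmol: float) -> float:
    """Mass in mg of `equiv` equivalents of a compound at a given scale in mmol."""
    return MOLAR_MASS[compound] * equiv * scale_mmol

scale = 1.0  # mmol of CaF2, hypothetical
fluoromix_recipe = {"CaF2": 1.0, "K2HPO4": 2.5}  # equivalents quoted in the text
for compound, equiv in fluoromix_recipe.items():
    print(f"{compound}: {mass_mg(compound, equiv, scale):.1f} mg")
```

By mass, the phosphate salt therefore dominates the milled mixture even though fluorspar is the limiting fluoride source.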
Mechanistic studies gave insight on the
composition of Fluoromix and how it serves
as a fluorinating reagent. For the identification
of the water-soluble species, a sample of
Fluoromix was stirred in D2O. Centrifugation
followed by ¹⁹F NMR analysis of the
supernatant showed a signal at −121.9 parts per
million (ppm) assigned to fluoride (fig. S9).
A second ¹⁹F peak was observed and assigned
as FPO3²⁻ [chemical shift (δ) = −73.8 ppm, and
coupling constant (¹J_P-F) = 864 Hz]. A signal
at δ = 2.7 ppm, identified as HPO4²⁻, and the
doublet diagnostic of FPO3²⁻ at δ = 1.1 ppm
(¹J_P-F = 864 Hz) were observed by ³¹P NMR. The
matter derived from ball milling CaF2 with
K2HPO4 (Fluoromix) is stable (fig. S12) and
amenable to ex situ analysis by powder x-ray
diffraction (PXRD) to determine the
composition of the bulk crystalline phase (Fig. 3).
Analysis revealed new crystalline phases that
were identified as K3(HPO4)F and
K2−xCay(PO3F)a(PO4)b, along with residual crystalline
CaF2. No crystalline fluorapatite [Ca5(PO4)3F]
was detected. In considering the structures
of these new inorganic salts, we hypothesized
that ion metathesis between CaF2 and K2HPO4
might occur to afford calcium hydrogen
phosphate (CaHPO4) and potassium fluoride (KF),
or derivatives thereof. With this in mind,
mechanistic experiments were carried out, which
demonstrated that a new crystalline phase X(K)
is formed upon ball milling of KF with K2HPO4
(Fig. 3A). X(K) is present in Fluoromix, and is
shown to be K3(HPO4)F (Fig. 3, B and C), which
is isostructural to K3(PO3F)F and K3(SO4)F
(31, 32). The reactivity of independently
prepared X(K) was investigated using TsCl under
optimized solution-phase conditions. X(K)
proved to be a highly effective fluorinating
reagent comparable to Fluoromix itself (Fig. 3D).
Further ball milling of X(K) [K3(HPO4)F] with
CaHPO4 afforded a new material Y(KCa), which
is also present in Fluoromix (Fig. 3, A and B).
Y(KCa) contains both crystalline and
amorphous phases. The crystalline phase of Y(KCa)
has the proposed composition K2−xCay(PO3F)a(PO4)b,
featuring both K⁺ and Ca²⁺ (Fig. 3C),
and is topologically closely related to the
reported structure of K3CaH(PO4)2 (33).
K2−xCay(PO3F)a(PO4)b was independently generated
by ball milling CaHPO4 sequentially with KF
and then K2HPO4. We noted that the solid
matter generated by milling CaHPO4 with KF
is amorphous and afforded the crystalline phase
of Y(KCa) upon milling with K2HPO4 (figs. S19
and S21). The ¹⁹F NMR spectrum of Y(KCa)
in D2O displays a resonance of FPO3²⁻ (δ =
−73.9 ppm, and ¹J_P-F = 865 Hz), along with a
signal attributed to fluoride (δ = −122.2 ppm)
(fig. S18). As a fluorinating reagent, Y(KCa)
shows a level of performance that is markedly
lower than that of Fluoromix (Fig. 3D).
Collectively, these data provide insight into the
composition and reactivity of Fluoromix and
indicate that component X(K) is a superior
fluorinating reagent to Y(KCa).

This study presents a direct pathway to 
fluorochemicals from acid-grade fluorspar by 
[Figure 2 graphic. Synthesis of Fluoromix: acid-grade fluorspar (CaF2, 1 equiv) and K2HPO4 (2.5 equiv)* are ball milled at 35 Hz for 9 h (total) with one 7-g ball in a 15-ml stainless-steel jar to give a powdered solid fluorinating reagent. S-F bond scope (1 to 25): R-SO2Cl to R-SO2F with Fluoromix (1 equiv) and H2O (2 equiv) in tBuOH (0.25 M) at 25 to 100 °C for 1 to 48 h. C(sp3)-F bond scope (26 to 45): R-Br to R-F with Fluoromix (2 equiv), H2O (5 equiv), and 18-crown-6 (1 equiv) in tBuOH (0.25 M) at 60 to 100 °C for 5 to 24 h. C(sp2)-F bond scope (46 to 51): (Het)Ar-Cl to (Het)Ar-F with Fluoromix (3 equiv) and Me4NCl (0.5 equiv) in DMSO (0.25 M) at 150 to 160 °C for 1 to 16 h.]


Fig. 2. Scope of S-F bond and C-F bond formation. Scope of S-F bond
formation (top) and C-F bond formation (bottom). All yields are for isolated
products (0.5 mmol scale unless otherwise stated). *Anhydrous K2HPO4 added
to acid-grade fluorspar in three stages during ball milling (see fig. S6); †EtCN as
solvent; ‡Using 1.2 equiv of Fluoromix; §0.25 mmol scale; ¶Using 2.2 equiv of
Fluoromix; #¹⁹F NMR yields using 4-fluoroanisole as internal standard; **1,2-DCB
as solvent; ††Yield over two steps, prepared by addition of trans,trans-farnesyl
mercaptan (S7) to ESF (19); ‡‡Isolated as a diastereomeric mixture (1:2 α:β);
§§From R-I; ¶¶Using 2.5 equiv of Fluoromix; ##Using 4.0 equiv of Fluoromix.
Ac, acetyl; AEBSF, aminoethylbenzenesulfonyl fluoride; DCB, dichlorobenzene;
Et, ethyl; FSB, fluorosulfonylbenzoic acid; GSTs, glutathione S-transferases;
Me, methyl; Ph, phenyl; Phth, phthalimide; PMSF, phenylmethanesulfonyl fluoride.
[Figure 3 graphic. (A) Control experiments: ball milling of K2HPO4 (1 equiv) with KF (1 equiv) at 35 Hz for 3 h. (B) PXRD patterns of Fluoromix, CaF2, X(K), and Y(KCa). Axis label: fluoride source.]


orinating reagents, can be envisaged. CaF2 may
therefore become a direct source of fluoride 
for the production of fluorochemicals via a 
process that bypasses the production of HF. 
Ultraflexible endovascular probes for brain recording
through micrometer-scale vasculature

Anqi Zhang, Emiri T. Mandeville, Lijun Xu, Creed M. Stary, Eng H. Lo, Charles M. Lieber


Implantable neuroelectronic interfaces have enabled advances in both fundamental research and
treatment of neurological diseases, but traditional intracranial depth electrodes require invasive surgery
to place and can disrupt neural networks during implantation. We developed an ultrasmall and
flexible endovascular neural probe that can be implanted into sub-100-micrometer-scale blood vessels
in the brains of rodents without damaging the brain or vasculature. In vivo electrophysiology recording of
local field potentials and single-unit spikes has been selectively achieved in the cortex and olfactory
bulb. Histology analysis of the tissue interface showed minimal immune response and long-term stability.
This platform technology can be readily extended as both research tools and medical devices for the
detection and intervention of neurological diseases.


Neuroelectronic interfaces establish
communication between the brain and
external devices (1-3). Many such interfaces
have been developed to gather and
modulate different forms of neural information,
yet there is a clear trade-off between
invasiveness and spatial resolution. Nonpenetrating
techniques such as electroencephalography
(EEG) (4) and electrocorticography (5) are less
invasive but lack the spatial resolution to
target individual neurons and are limited to
recording from the brain surface. By contrast,
invasive approaches such as depth
electrodes (6-9) can achieve single-cell, single-spike
resolution in deep brain regions, but open-skull
implantation poses considerable risks (10),
including intracortical bleeding and infection, as
well as damage to the targeted brain regions.
To overcome the trade-off of invasiveness and
resolution, we report an ultraflexible
micrometer-scale neuroelectronic interface that uses a native
delivery system: the brain vasculature. The
metabolically active central nervous system
requires a dense vascular network, so the average
neuron is less than 20 µm from the nearest blood
vessel (11, 12). This vasculature thus offers
recording probes access to any brain region
without damaging the recorded neural circuits.
Endovascular recording of brain waves in
millimeter-scale blood vessels has been
demonstrated in previous studies (13-15).
Endovascular EEG in humans (16) was first achieved
using a guided stainless-steel catheter with a
15 mm × 0.6 mm electrode. Through the
internal carotid artery (ICA) in the neck, the
catheter was advanced to the middle cerebral
artery (MCA) in the brain (diameter ~2.9 mm)
(17). The stentrode™ (Synchron) has an
electrode array consisting of eight 0.75-mm
electrode discs on a self-expanding stent (18, 19).
The stent was inserted through the jugular
vein in the neck into the superior sagittal
sinus (diameter ~2.4 mm) in sheep. However,
much of the brain remains inaccessible with
these electrodes because metal-based catheters
and stents are stiff and bulky, and navigating
them through tortuous brain vasculature with
vessels down to the micrometer scale can
result in tissue damage and inflammation. Other
reports proposed the use of flexible devices
to avoid tissue damage. For example,
polymer-based wires with a magnetic head can be
driven with flow and magnetic steering,
demonstrated navigation inside microfluidic
devices, and showed endovascular insertion in
ex vivo rabbit ears (20). However, in vivo
endovascular implantation and electrophysiology
have not yet been achieved with such flexible
devices.
We demonstrate ultraflexible micro-endovascular
(MEV) probes that can be precisely delivered
through the blood vessels in the neck into
sub-100-micrometer scale vessels in rat brains (Fig.
1A). In vivo electrophysiology recording of local
field potentials and single-unit spikes has been
Fig. 1. Endovascular implantation and overview of the MEV probe.
(A) Schematic showing an MEV probe (yellow) implanted into a rat brain through
the blood vessels in the neck. The mesh region with electrodes is implanted
in deep cerebrovasculature whereas the input/output (I/O) region remains
exteriorized for subsequent connection and measurement. (B) Schematic of
the MEV probe with the ultraflexible mesh region at left tapering into the stem
and I/O region at right. The middle line in red is the thick SU-8 guidewire
layer. Insets provide magnified views of (i) probe tip containing the 25-µm
wide guidewire (red) and ultraflexible SU-8 mesh region (light gray); (ii) Pt
recording electrodes (gray circles); (iii) SU-8 stem containing independent
interconnects for each electrode; and (iv) the I/O pads consisting of a gold
region (yellow) connected by SU-8. (C) Tiled brightfield images of the MEV probe
with a 25-µm wide guidewire. The overview and insets are the same size as
the schematic regions in (B). The cross-section surface profiles of the probe are
shown in fig. S2. (D) Side view of the calculated bending of the 18-mm-long
guidewire in the device region under the same applied force (F = 10 nN).
Deflection of the tip of the 25-µm guidewire probe is 3 times that of the 75-µm
guidewire probe (Supplementary Text, fig. S4).
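
The factor of 3 in (D) is what a simple cantilever model predicts when only the guidewire width changes. The sketch below assumes small-deflection Euler-Bernoulli beam theory for a rectangular cross section, δ = F L³ / (3 E I) with I = w h³ / 12. The Young's modulus and thickness are placeholder values, not taken from the paper, and they cancel in the ratio. The authors' own calculation is described in their Supplementary Text and may differ.

```python
def tip_deflection(force_n, length_m, width_m, thickness_m, youngs_modulus_pa):
    """Tip deflection of an end-loaded cantilever with a rectangular cross section."""
    second_moment = width_m * thickness_m ** 3 / 12.0
    return force_n * length_m ** 3 / (3.0 * youngs_modulus_pa * second_moment)

FORCE = 10e-9      # 10 nN, from the caption
LENGTH = 18e-3     # 18 mm, from the caption
THICKNESS = 5e-6   # placeholder guidewire thickness (assumption)
MODULUS = 2e9      # placeholder Young's modulus for SU-8 (assumption)

narrow = tip_deflection(FORCE, LENGTH, 25e-6, THICKNESS, MODULUS)
wide = tip_deflection(FORCE, LENGTH, 75e-6, THICKNESS, MODULUS)
print(narrow / wide)  # ratio of tip deflections, 25-µm versus 75-µm guidewire
```

Only the ratio is meaningful here: with these placeholder values the absolute deflections fall outside the small-deflection regime, so they should not be read as predictions for the real probe.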
Fig. 2. Branch-selective implantation. (A) Schematics of endovascular
implantation surgery. A microcatheter loaded with the MEV probe is inserted from
the ECA opening to the MCA/ACA bifurcation. Saline carries the probe into the MCA
or ACA. (B) Angles (n = 13) of the MCAs and the ACAs from the ICAs (with ±1
standard deviation, s.d.). (C) Radii of curvature and effective bending angles of the
device region (n = 20, with ±1 s.d.) in water. Insets are representative images of
the probes with 25-µm and 75-µm guidewires. (D) (Left) Schematic of a microfluidic
channel designed to mimic the MCA/ACA bifurcation. (Right) Representative
implantation of a 25-µm guidewire probe into the MCA channel and a 75-µm
guidewire probe into the ACA channel. (E) Percentage of probes with different
guidewire widths injected into MCA and ACA channels (n = 5 for 25 µm, n = 11 for
75 µm) and rat vessels (n = 18 for 25 µm, n = 8 for 75 µm). (F) Inferior and
(G) sagittal views of dissected rat brains with MEV probes in the MCA. (H) Inferior
view of a rat brain with a probe in the ACA. Scale bars in all panels are 1 mm.
Fig. 3. In vivo endovascular recording. (A) Sagittal view of a rat brain showing
an MEV probe in the MCA and (B) the corresponding acute in vivo 16-channel
unfiltered recording showing local field potential oscillations under ketamine/xylazine
anesthesia. The relative positions of 16 Pt electrodes are marked by red spots in
the schematic in (B) and higher-numbered channels were implanted deeper in the
MCA. (C) (Top) Penicillin-induced seizures recorded by a representative channel from
a probe in MCA for 2 min before and 20 min after penicillin administration. (Middle)
spectrogram before and after penicillin administration. The color map shows the
normalized power levels of the recording trace. (Bottom) Zoomed-in views showing the
evolution of seizure spikes at three different time points. (i to iii) Changes in spike
amplitude over time. (D) Number of spikes with amplitude over 1 mV per min recorded
by the probe in MCA in (C). 1034 spikes were recorded 20 min post administration.
(E to F) Recording and analysis of penicillin-induced seizures from a probe in ACA.
(i to iii) show the recording trace before, between, and during burst firing activity.
496 spikes were recorded in 20 min post administration in (F). Black arrows denote
the time point of penicillin administration.


selectively achieved in the cortex and olfactory
bulb. The flexible probe/vessel wall/brain
interface exhibits minimal inflammatory response
and long-term stability.


Micro-endovascular probes and their
bending properties


We first consider the design of MEV probes
and how they meet the key constraints for
implantation into the narrow and tortuous
sub-100-micrometer-scale brain vascular network
without causing damage. Inspired by minimally
invasive catheter-based injection procedures
(21), we design polymer-based ultraflexible
MEV probes that can be loaded into and
injected from flexible microcatheters. After
inserting the microcatheter to the target vessel,
the MEV probes are injected into deeper
vasculature by saline flow in the microcatheter.
The microcatheter is then retracted, leaving the


probes in place. This procedure imposes sev- 
eral requirements on the probe design: First, 
the probes should be able to be loaded into and 
move smoothly in the microcatheter. Second, 
the longitudinal bending stiffness of the MEV 
probes must be sufficient to ensure smooth in- 
jection into deeper vasculature beyond the reach 
of microcatheters without buckling, but also 
low enough to follow the tortuous vessels and 
prevent mechanical damage to vessel walls. 
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Fig. 4. Single unit recording. (A to B) Inferior (left) and anterior (right) views 
of a rat brain with an MEV probe in ACA in (A) the olfactory bulb and (B) the 
corresponding multichannel recording from the same probe after 250 to 

6000 Hz band-pass filtering to show single unit burst activity. Higher-numbered 
channels were implanted deeper in the ACA. Scale bars in (A) are 5 mm. 

(C) (Top) sorted spikes assigned to different neurons from the channels with 
single unit activity. Each distinct color in the sorted spikes represents a unique 


Third, to optimize recording quality, electrodes 
on the MEV probes should closely attach to 
the inner vessel walls (78). 

The MEV probe (Fig. 1, B and C, and figs. 
S1 and S2) shows the ultraflexible mesh-like 
device region on the left, tapering into the 
flexible stem in the middle, and then to the 
input/output (I/O) region on the right. In 
the device region, metal electrodes are em- 
bedded in a SU-8 polymer-based mesh-like 
substrate and are delivered to the targeted 
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brain region, whereas the I/O region remains 
exteriorized for subsequent connection and 
measurement. The injection depth can be de- 
termined by tracking the location of the I/O 
region during implantation. 

Several aspects of the MEV probe structure 
have been considered. First, the transverse 
ribbons in the device region enable the 900-um- 
wide probe to be rolled up inside the micro- 
catheter, a flexible tube with an inner diameter 
of 200 um and an outer diameter of 350 um 


Rh © fF 
So Oo 8 


Spike number per min 


: 6 8 1012141618 
min 
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identified neuron. (Bottom) spike amplitude (with +1 s.d.) and the number of 
single unit spikes recorded in 20 min. (D) Periodic single-unit spikes recorded 
by a probe in ACA. (Top) the unfiltered recording trace. (Bottom) after 250 

to 6000 Hz band-pass filtering. (E) Single-unit spikes sorted from the data shown 
in (D). (F) Changes in firing frequency with different isoflurane concentration in 
which higher concentration (2.0%) decreased and eventually suppressed firing 
whereas lower concentration (0.5%) temporarily recovered firing. 


(22). Second, an approximately 10-1m-thick 
guidewire made of SU-8 polymer (Fig. 1B) was designed to optimize longitudinal bending
stiffness for smooth injection into tortuous vessel branches without buckling. The guidewire
determines the overall bending stiffness of the MEV probe; it was placed over the middle
ribbon (other longitudinal ribbons are approximately 0.8 μm thick) in the device region, the
stem, and the I/O regions (fig. S2). The bending stiffness of the MEV probes is comparable
to free-standing 100-μm blood vessels but because the blood vessels are embedded in brain
tissue, the vessels will have a larger bending stiffness than the MEV probes (Supplementary
Text). The probes were manufactured, released from substrates, and loaded into
microcatheters (fig. S3) (22-24). The probes can advance smoothly within the flexible
microcatheter and will remain extended without buckling even when the microcatheter is
bent (movie S1). Altering the guidewire width can modify the deflection of the probe tip
under a given applied force (Fig. 1D and fig. S4; Supplementary Text). Third, the mesh-like
structure relaxes and unfolds after injection (22) (fig. S3B and movie S2), allowing the
electrodes to adhere against the inner vessel walls, in a similar manner to vascular stent
deployment. The rest of the probe is subsequently injected from the microcatheter
(fig. S3C). In the device region, 16 platinum electrodes (80 μm each) are distributed over a
length of 1 cm, allowing probing of multiple brain regions.

Fig. 5. Chronic histology. (A) Optical image of an IgG-stained coronal brain slice
28 days postimplantation in MCA. Scale bar is 1 mm. Inset shows the location of
the slice. (B) Zoomed-in views of the contralateral and ipsilateral MCAs from
Hematoxylin and Eosin (H&E)-stained slices from the regions highlighted by the blue
boxes in (A). The yellow arrow highlights the probe embedded in the vessel wall.
Scale bar is 100 μm. Note that the vessels were cut at an angle to the central axis; the
vessel diameter is the minor axis of the oval cross-section. (C) MCA vessel wall
thickness with ±1 s.d. measured from similar H&E-stained images as shown in (B).
(D) Fluorescence microscopy images from the regions highlighted by the red boxes in
(A). The brain slices were stained for Iba1 (green) and GFAP (red), and DAPI (blue).
Scale bar is 100 μm. (E) Number of microglia (Iba1) and astrocytes (GFAP) within
600 μm × 450 μm regions, with ±1 s.d. Results in (C) and (E) were measured from
15 brain slices from three rats (n = 5 for each rat, brain slices from the same rat are
600 μm apart). n.s., nonsignificant; unpaired two-tailed t-test.


Branch-selective implantation


To demonstrate endovascular implantation, we exploited the established surgical procedure
used for rodent stroke models, middle cerebral artery occlusion (MCAO) (25), without
introducing an occlusion. The common carotid artery (CCA) bifurcates into the external
carotid artery (ECA) and ICA in the neck. The ICA segment in the brain branches to form
two major cerebral arteries, the MCA and the anterior cerebral artery (ACA), which overlay
the cortex and the olfactory bulb, respectively. To perform MCAO, a filament is inserted into
the ECA and threaded through the ICA until it occludes the MCA/ACA bifurcation. Similarly,
a microcatheter loaded with an MEV probe and attached to a syringe can be inserted into
the MCA/ACA bifurcation and the probe subsequently injected (Fig. 2A and fig. S5). Whereas
the microcatheter can only reach the MCA/ACA bifurcation, the saline flow in the
microcatheter allows the probe to be carried much deeper into either MCA or ACA branches.
Following injection, the microcatheter is retracted, leaving the MEV probe in the MCA or ACA.

We achieve branch-selective implantation by tuning mechanical properties of the probe.
Because the MCA and the ACA branches are at different angles from the ICA, we sought
to determine whether changing the bending angle and bending stiffness of the guidewire
would enable selective targeting of MCA and ACA branches. The angles of MCA and ACA
from ICA (n = 13, each) were measured to be 108 ± 7° and 153 ± 12°, respectively (Fig. 2B).
To match the bending angles of MCA and
ACA, we fabricated MEV probes with guidewire
widths of 25 μm and 75 μm, respectively.
As a result of the residual stress in the SU-8
film generated during processing (26, 27), the
device region of the free-standing 25-μm and
75-μm guidewire probes formed radii of curvature
of 8.5 ± 1.3 mm and 18.5 ± 4.2 mm in
water, respectively (n = 20) (Fig. 2C and fig.
S6), which correspond to effective bending
angles of 119° and 152°. To test the branch
selectivity in vitro, we fabricated polydime- 
thylsiloxane microfluidic channels matching 
the diameters and angles of the proximal seg- 
ments of the ICA, MCA, and ACA (Fig. 2D). 
Microcatheters carrying 25-μm and 75-μm
guidewire MEV probes were inserted into 
the ICA channel and the probes were man- 
ually injected into either the MCA or the ACA 
channels. 
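
As a rough illustration of why guidewire width changes how far the probe tip bends under a given force, the sketch below treats the guidewire as a rectangular cantilever in Euler-Bernoulli beam theory, where the tip deflection under an end load F is δ = F L³ / (3 E I) and I = w t³ / 12. The 10-μm thickness and the 25-μm and 75-μm widths come from the text; the Young's modulus (2 GPa for SU-8), the free length (1 cm), and the applied force (10 nN) are illustrative assumptions, and the model ignores the thin mesh ribbons, residual stress, and large-deflection effects. It is not the analysis used in the Supplementary Text.

```python
def second_moment_rect(width_m, thickness_m):
    """Second moment of area of a rectangular cross-section bending about its width."""
    return width_m * thickness_m ** 3 / 12.0


def tip_deflection(force_n, length_m, youngs_pa, width_m, thickness_m):
    """Small-deflection cantilever tip displacement under an end load."""
    bending_stiffness = youngs_pa * second_moment_rect(width_m, thickness_m)
    return force_n * length_m ** 3 / (3.0 * bending_stiffness)


# Values taken from the text
THICKNESS_M = 10e-6            # approximately 10-um-thick guidewire
WIDTHS_UM = (25, 75)           # guidewire widths used for MCA and ACA targeting

# Assumed values (hypothetical, for illustration only)
E_SU8_PA = 2.0e9               # assumed Young's modulus of SU-8
LENGTH_M = 1e-2                # assumed free length of the beam
FORCE_N = 10e-9                # assumed end load

deflections = {
    w_um: tip_deflection(FORCE_N, LENGTH_M, E_SU8_PA, w_um * 1e-6, THICKNESS_M)
    for w_um in WIDTHS_UM
}
ratio = deflections[25] / deflections[75]

for w_um, d in deflections.items():
    print(f"{w_um}-um guidewire: tip deflection {d * 1e3:.3f} mm")
print(f"deflection ratio (25 um / 75 um): {ratio:.2f}")
```

In this simplified model the bending stiffness scales linearly with width, so the 25-μm guidewire deflects three times as far as the 75-μm guidewire for the same load; thickness, which enters as t³, would be a much stronger tuning parameter.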

An analysis of the branch selectivity (Fig. 2E)
in microfluidic channels shows that 80% (n = 5)
of the probes with 25-μm guidewires were
injected into the MCA channel whereas 100%
(n = 11) of the probes with 75-μm guidewires
were injected into the ACA channel. Similar
branch selectivity was observed for the MEV
probes implanted in rat brains in vivo: 72%
(n = 18) of the probes with 25-μm guidewires
were implanted into the MCAs whereas 88%
(n = 8) of the probes with 75-μm guidewires
were implanted into the ACAs. The inferior/ 
horizontal and sagittal views of the perfused 
and dissected rat brains confirmed the im- 
plantation into the MCA (Fig. 2, F and G) and 
the ACA (Fig. 2H) branches, where the typical 
injection depth exceeds 1 cm into the distal 
segments past the microcatheter opening at 
the MCA/ACA bifurcation. Magnified views of 
the brains (Fig. 2, F to H) show that the MEV 
probes maintain their extended shapes within 
the targeted vessels without buckling, and the 
individual electrodes are easily identifiable. 
In addition, the ultraflexible device region is 
rolled up within the vessel, squeezing the 
80-μm electrodes to the midline, indicating
that the inner diameters of the targeted vessels 
are on the 100-micrometer scale. 
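
Because the selectivity percentages above rest on small sample sizes, it can help to attach an uncertainty to each of them. The sketch below computes 95% Wilson score intervals for the four reported proportions. The success counts (4/5, 11/11, 13/18, 7/8) are not stated in the text; they are inferred here from the reported percentages and n, and should be read as an assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


# (successes, n); counts inferred from the reported percentages (assumption)
groups = {
    "in vitro, 25-um guidewire into MCA": (4, 5),
    "in vitro, 75-um guidewire into ACA": (11, 11),
    "in vivo, 25-um guidewire into MCA": (13, 18),
    "in vivo, 75-um guidewire into ACA": (7, 8),
}

intervals = {label: wilson_interval(k, n) for label, (k, n) in groups.items()}

for label, (k, n) in groups.items():
    low, high = intervals[label]
    print(f"{label}: {k}/{n}, 95% interval {low:.2f} to {high:.2f}")
```

The Wilson interval is used rather than the simpler normal approximation because it behaves sensibly for small n and for proportions at or near 100%.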


In vivo endovascular recording 


With probes implanted into the MCA (Fig. 3A, 
fig. S7) and the ACA of anesthetized rats (fig. 
S8) we demonstrated the ability of the MEV 
probes to record brain activity. Representative 
multichannel recordings (Fig. 3B and fig. 
S8) yielded well-defined signals across all 
16 channels. The fluctuation amplitude (200 μV
to 2 mV) and the dominant frequency (<2 Hz) 
recorded are characteristic of the delta wave 
local field potentials under ketamine/xylazine 
anesthesia (28). 

Following this, we investigated whether the 
MEV probes implanted in the MCAs covering 
the cortex and the ACAs covering the olfactory 
bulb reveal different firing properties of differ- 
ent brain regions in neurological disease mod- 
els. In anesthetized rats, we created epilepsy 
models by inducing local seizures with intra- 
cortical penicillin injection (29, 30) into the 
right hemisphere where the probes were im- 
planted (Fig. 3, C and E). Electrophysiological 
recordings by a representative channel on a 
probe implanted in the MCA (Fig. 3C) show 
seizure activity characterized by bilateral spikes 
and spike-wave complexes. Following penicillin 
administration, seizures began almost immedi- 
ately and reached a constant level after about 
4 min (Fig. 3, C and D). The mean spike fre- 
quency and amplitude from 5 to 20 min were 
57 ± 6 per min and 1.95 ± 0.62 mV, respectively.
Simultaneous recording from all 16 channels
of this probe identified spikes only in three
adjacent channels (fig. S9), demonstrating the 
ability of the MEV probes to locate and track 
the seizure foci. Recordings from MEV probes 
in MCAs in two other rats recorded similar 
firing patterns (fig. S10). In comparison, rep- 
resentative recordings from a probe implanted 
in the ACA showed a latent phase lasting about 
4 min after penicillin administration (Fig. 3, E 
and F). After seizure onset, a burst-suppression 
pattern was observed. The burst firing activity
around 17 min consists of periodic field potential
waves (frequency: 0.88 Hz, peak width:
95.8 ± 20.5 ms, amplitude: 2.89 ± 0.24 mV)
followed by a train of fast and narrow spikes
(frequency: 14.65 Hz, peak width: 1.9 ± 1.0 ms,
amplitude: 0.71 ± 0.24 mV) (Fig. 3E). Seizure
recordings from ACAs in two other rats showed 
similar firing patterns (fig. S11). 
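
Spike rates such as the number of spikes above 1 mV per minute can be obtained with simple threshold-crossing detection. The following is a minimal sketch, not the analysis pipeline used in this study: the sampling rate, the 50-ms refractory window, and the synthetic trace are assumptions chosen only to make the example self-contained, while the 1-mV threshold follows the figure description.

```python
def count_spikes_per_minute(trace_mv, fs_hz, threshold_mv=1.0, refractory_s=0.05):
    """Count threshold crossings of |v| in each one-minute bin.

    A spike is registered when |v| rises to or above the threshold from below
    and the previous registered spike is more than `refractory_s` in the past.
    """
    refractory = int(refractory_s * fs_hz)
    samples_per_min = int(60 * fs_hz)
    n_minutes = (len(trace_mv) + samples_per_min - 1) // samples_per_min
    counts = [0] * n_minutes
    last_spike = -(refractory + 1)
    above_prev = False
    for i, v in enumerate(trace_mv):
        above = abs(v) >= threshold_mv
        if above and not above_prev and i - last_spike > refractory:
            counts[i // samples_per_min] += 1
            last_spike = i
        above_prev = above
    return counts


# Synthetic two-minute trace sampled at 100 Hz (assumed values)
FS_HZ = 100
trace = [0.0] * (2 * 60 * FS_HZ)
for idx, amp in [(100, 1.5), (2000, -1.5), (5000, 1.5), (7000, 1.5), (11000, -1.5)]:
    trace[idx] = amp
trace[3000] = 0.5      # below threshold, should be ignored
trace[5002] = 1.5      # inside the refractory window of the spike at 5000

counts = count_spikes_per_minute(trace, FS_HZ)
print(counts)
```

Taking the absolute value lets the same threshold catch both upward and downward deflections; a real recording would normally be band-pass filtered first.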

Comparing the data recorded in the MCAs 
and the ACAs reveals several important char- 
acteristics. First, the seizure spikes spread to 
the cortex (MCA territory) faster than the 
olfactory bulb (ACA territory), suggesting that 
penicillin-induced seizures start as focal activ- 
ity, then propagate to other brain regions, 
inducing generalized seizures (31). Second, the
seizure activity in the cortex regions was con- 
sistent with previous reports of rats treated 
with intracortical penicillin administration 
(29, 30). Third, the burst-suppression pattern 
and the periodic field potential waves followed 
by spike trains recorded from the probes in 
ACA were similar to those observed from rat 
olfactory bulb cells (32, 33). 


Single unit activity recording 


We examine the possibility of recording single- 
unit spikes across the blood vessel wall, which 
has not been achieved using previous endo- 
vascular probes as a result of their inability to 
target micrometer-scale vasculature with thin 
vessel walls. Typical depth electrodes detect 
spikes from neurons ~130 μm away (34). A
100-μm artery has a vessel wall thickness of
~10 to 20 μm (35), which is well within the
detection range. From the MEV probes injected 
deeply into the ACA segment overlaying the 
olfactory bulb under isoflurane anesthesia 
(Fig. 4A), we repeatedly recorded discontinuous, 
prolonged spontaneous bursts of action poten- 
tials with spindle shapes lasting for tens of 
seconds (Fig. 4B), characteristic of olfactory 
bulb mitral cells of anesthetized rats (32, 36). 
The recording trace of the channels with the 
largest amplitude activity exhibits single-unit 
spikes. Single neuron activity was tracked by 
clustering the sorted spikes with principal com- 
ponent analysis from the channels with single 
unit activity (23), with three neurons identified 
from Ch. #1, and two neurons identified from 
Ch. #2 to Ch. #5 (Fig. 4C, top). All neurons 
exhibit higher spike amplitude and number 
of spikes in Ch. #1 and decay from Ch. #2 
through Ch. #5, indicating that Ch. #1 is the 
closest to the spiking neurons (Fig. 4C, bot- 
tom). Similar recordings were reproducibly 
observed from other MEV probes in ACAs 
(fig. S12). In addition, from another MEV 
probe injected into the ACA under isoflurane 
anesthesia, we have observed burst spikes 
nested onto the respiratory rhythm in the field 
potential (37, 38) (Fig. 4D). The sharp downward
spikes (Fig. 4E) showed a uniform potential
waveform with average duration ~1 ms
and peak-to-peak amplitude of ~60 μV, characteristic
of single-unit action potentials (23).
We tested modulating neuron firing by raising 
isoflurane concentration (Fig. 4F) from 1.5% 
to 2%, which suppressed spiking. Decreasing 
to 0.5% restored spiking, which eventually dis- 
appeared, likely attributable to the prolonged 
exposure to high-concentration isoflurane (39). 


Chronic histology 


We examined both the short-term and chronic 
effects of MEV probes. For short-term effects,
laser doppler flowmetry was used to monitor
cerebral blood flow before and immediately 
after probe injection. Representative laser 
doppler flowmetry traces of probe implanta- 
tion in the MCA and the ACA (fig. S13) show 
that probe implantation does not have a sub- 
stantial effect on the cerebral blood flow. Im- 
mediately after MCA implantation, the blood 
flow fluctuated between 60 and 140% of the 
baseline, but the average value remained ~ 100%. 
The blood flow throughout the implantation 
processes remained much higher than the 
laser doppler flowmetry level required to in- 
duce stroke, which should be stabilized below 
30% of the baseline for 90 min (40). On days 1, 
3, 7, 14, and 28 after MEV probe implantation, 
we conducted behavior tests using neurolog- 
ical severity scores (41). By day 3, all rats had 
reached a score of 0, indicating no neurologic 
deficit. In addition, we performed experiments 
showing that MEV probes are not able to de- 
form or penetrate the vessel walls (movie S3 
and Supplementary Text). 

We examined how chronic MEV probe im- 
plantation affects blood vessel walls and brain 
tissue. A histology study of the MCA was per- 
formed 28 days after implantation as neointi- 
mal formation reaches its maximum 28 days 
after rat artery stenting, thickening the walls 
substantially (42, 43), and the rat MCAO mod- 
els start to recover from ischemic damage after 
28 days (44). First, the blood-brain barrier 
(BBB) integrity was evaluated with immuno- 
globulin G (IgG) staining (45). IgG-stained brain 
slices 28 days after implantation in MCA 
(Fig. 5A) showed no increase in IgG protein in 
the ipsilateral hemisphere when compared with 
the contralateral hemisphere, which confirms 
that the integrity of the BBB was well-preserved. 
Second, cross sections of the contralateral and 
ipsilateral MCAs were examined (Fig. 5B and 
fig. S14), showing that the targeted vessel diameter
is less than 100 μm and that the probe
ribbons were embedded in the vessel walls.
The vessel wall thicknesses of the contralateral
and ipsilateral hemispheres are 18.9 ± 3.2 μm
and 17.9 ± 2.0 μm, respectively, measured from 15 brain
slices from three rats (Fig. 5C). On the ipsi- 
lateral side, no increase in vessel wall thickness 
was observed, confirming that MEV implants 
did not cause neointimal formation, which is 
commonly observed following vascular stenting
(42, 43). Third, the lateral cerebral cortex within
the MCA territory was evaluated with fluorescence
microscopy (Fig. 5D). Microglia,
astrocytes, and nuclei were identified using
antibodies against Iba1 (green), glial fibrillary
acidic protein (GFAP, red), and DAPI (blue).
The numbers of microglia counted from both
hemispheres from 15 brain slices from three
rats are 24 ± 6 on the contralateral side and
24 ± 5 on the ipsilateral side, while the numbers
of astrocytes are 53 ± 17 on the contralateral
side and 56 ± 14 on the ipsilateral side.
There was no increase of microglia and astro- 
cytes on the ipsilateral side, indicating that 
endovascular implantation does not induce a 
significant immune response. Moreover, short- 
term histology studies conducted 3 days post- 
implantation in the MCA (fig. S15) and the 
ACA (fig. S16), and chronic histology studies of 
the ICA segment in contact with the micro- 
catheter (fig. S17) also showed no increase in 
vessel wall thickness or number of immune 
cells. Our chronic histology results showed a 
substantial improvement from previously re- 
ported stiff endovascular probes, which could 
induce chronic venous thrombosis and occlusion 
(18). These observations not only demonstrate 
the minimal invasiveness of the MEV probes, 
but also indicate major advantages in chronic 
electrophysiology recording, as the accumula- 
tion of glial scar tissue near the brain elec- 
trodes is known to cause electrode failure in 
clinically relevant chronic settings (46). 
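
The contralateral versus ipsilateral comparisons above can be sanity-checked from the reported summary statistics alone. The sketch below computes a pooled-variance (Student) unpaired t statistic for each comparison and compares it with the two-tailed 5% critical value. It assumes n = 15 slices per side treated as independent samples, and it assumes a pooled-variance test; the text states only "unpaired two-tailed t-test", so both choices are assumptions. The critical value 2.048 for 28 degrees of freedom is taken from standard t tables.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's unpaired t statistic and degrees of freedom from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


N_PER_SIDE = 15          # assumed: 15 slices per hemisphere, treated as independent
T_CRIT_DF28 = 2.048      # two-tailed, alpha = 0.05, df = 28 (standard t table)

# (contralateral mean, s.d., ipsilateral mean, s.d.) as reported in the text
comparisons = {
    "vessel wall thickness (um)": (18.9, 3.2, 17.9, 2.0),
    "microglia count": (24, 6, 24, 5),
    "astrocyte count": (53, 17, 56, 14),
}

results = {}
for name, (m_c, sd_c, m_i, sd_i) in comparisons.items():
    t_stat, df = pooled_t(m_c, sd_c, N_PER_SIDE, m_i, sd_i, N_PER_SIDE)
    results[name] = (t_stat, df, abs(t_stat) > T_CRIT_DF28)
    print(f"{name}: t = {t_stat:.2f}, df = {df}, significant at 5%: {results[name][2]}")
```

Under these assumptions none of the three comparisons reaches significance, which is consistent with the "n.s." labels reported for Fig. 5.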


Outlook 


This study demonstrated ultraflexible probes 
that can be delivered into sub-100-micrometer- 
scale vessels in rodents without open-skull 
surgery. The probes can be selectively im- 
planted into small vessel branches that are not 
accessible to any available microcatheters, 
thus enabling neural recording across vessel 
walls at single cell resolution. Histology studies 
of the probe-tissue interface showed minimal 
immune response and long-term stability. 
The cerebrovasculature has a hierarchical
structure ranging from large superficial cortical
vessels to the microvasculature and capillary
beds in the cortex. In the rat brain, 4 to 5%
of vessels have a diameter larger than or
approximately equal to 100 μm
(47), which could be targeted by the reported
MEV probes. It should also be possible to target
smaller diameter vessels by reducing the
width and/or transverse bending stiffness of
the probes. In comparison, endovascular probes
reported for humans and sheep have only been
able to target the largest vessel, the 2.4-mm-diameter
sagittal sinus (18).

We believe that this platform technology 
could be extended to the detection and treat- 
ment of many neurological diseases as a re- 
search tool and could serve as the foundation 
for clinical translation of minimally invasive 
neuroelectronic interfaces.


Hexasome-INO80 complex reveals structural basis of
noncanonical nucleosome remodeling


Min Zhang, Anna Jungblut, Franziska Kunert, Luis Hauptmann, Thomas Hoffmann,
Olga Kolesnikova, Felix Metzner, Manuela Moldt, Felix Weis, Frank DiMaio,
Karl-Peter Hopfner, Sebastian Eustermann


Loss of H2A-H2B histone dimers is a hallmark of actively transcribed genes, but how the cellular machinery 
functions in the context of noncanonical nucleosomal particles remains largely elusive. In this work, we report 
the structural mechanism for adenosine 5'-triphosphate-dependent chromatin remodeling of hexasomes by
the INO80 complex. We show how INO80 recognizes noncanonical DNA and histone features of hexasomes
that emerge from the loss of H2A-H2B. A large structural rearrangement switches the catalytic core of INO80
into a distinct, spin-rotated mode of remodeling while its nuclear actin module remains tethered to long
stretches of unwrapped linker DNA. Direct sensing of an exposed H3-H4 histone interface activates INO80,
independently of the H2A-H2B acidic patch. Our findings reveal how the loss of H2A-H2B grants remodelers
access to a different, yet unexplored layer of energy-driven chromatin regulation.


The packing of genomic DNA by histones
into nucleosomes is essential for the regulation
and maintenance of (epi)genetic
information in eukaryotes. A canonical
nucleosome core particle (NCP) is composed
of ~147 DNA base pairs (bp) wrapped
around a histone octamer that contains one 
histone 3 (H3)-histone 4 (H4) tetramer and two 
histone 2A (H2A)-histone 2B (H2B) dimers (1). 
A wealth of structural and mechanistic studies 
characterized the role of such canonical nucleo- 
somes (2, 3). However, relatively little is known 
about how the cellular machinery functions 
in the context of noncanonical nucleosomal 
particles. 

Noncanonical nucleosomal particles can 
differ from their canonical counterparts in their 
histone stoichiometry, structural plasticity, and 
mode of DNA wrapping (4). Growing evidence 
links noncanonical nucleosomes to energy- 
driven processes such as transcription and 
adenosine 5’-triphosphate (ATP)-dependent 
chromatin remodeling (5-9). The hexasome 
lacks one copy of an H2A-H2B dimer (10),
which can result in unwrapping of DNA (11, 12),
interplay with histone chaperones (13, 14), and 
collision with canonical NCPs (15). The hexa- 
some was discovered in complex with RNA poly- 
merase II in the context of active transcription 
(10). Recent in vivo mapping (6) in conjunction 
with structural studies (17, 18) suggests that RNA 
polymerase II activity causes the loss of H2A- 
H2B dimer. However, the molecular mechanisms
by which noncanonical nucleosomes might
subsequently function as substrates for ATP- 
dependent chromatin remodelers and other 
enzymes in shaping the epigenetic landscape 
of chromatin remained largely elusive. 

To directly address this knowledge gap, we 
sought to understand the structural mecha- 
nism by which multi-subunit ATP-dependent 
chromatin remodelers recognize and remodel 
hexasomal substrates. Hexasomes were recently 
identified as preferred substrates for the 15- 
subunit INO80 chromatin remodeler from 
Saccharomyces cerevisiae (S.c.) (9). Hexasomes 
are a hallmark of actively transcribed genes 
(5, 16), for which INO80 plays a central role in 
the translocation and positioning of +1 nucleo- 
somes adjacent to gene promoters and in the 
subsequent formation of genic nucleosome 
arrays (19-22). This raises fundamental ques- 
tions about the underlying mechanism. The 
activity of INO80 for nucleosomes relies on an 
intact H2A-H2B acidic patch at the proximal 
site in respect to linker DNA (9, 23, 24). More 
generally, the H2A-H2B acidic patch has been 
established as a pivotal binding platform for 
members of all remodeler families and a large 
number of other chromatin factors (3, 25). Yet, 
the hexasome lacks an H2A-H2B dimer and 
might therefore differ also in other fundamen- 
tal features, which underscores the need for 
understanding molecular mechanisms in the 
context of noncanonical nucleosomal particles. 


Cryo-EM structure of the INO80-hexasome 
complex in an active state 


We reconstituted a 1.1-MDa complex that 
consists of an evolutionarily conserved 11- 
subunit INO80 from Chaetomium thermophilum 
(C.t.) (23) bound to a hexasomal substrate, 
which lacks an H2A-H2B dimer on the pro- 
ximal face in respect to 80 bp of linker DNA 
(Fig. 1, A to C, and fig. S1). Upon the addition of
ATP, the complex displayed a robust hexasome 


translocation activity similar to that of the
S.c. INO80 complex, which suggests a conserved
mode of action (Fig. 1C and fig. S2). The activity
does not require the species-specific S.c.
Nhp10 module, which is not present in the
recombinant C.t. INO80 complex (23). Notably,
recombinant C.t. and S.c. INO80 showed a 
slightly different biochemical behavior compared 
with the previously reported endogenously 
purified complex (9). Competition assays showed 
that nucleosomes are preferred substrates to 
hexasomes for both recombinant C.t. and S.c.
INO80 complexes when core particles with 
comparable DNA linker lengths were tested 
(80 bp versus 40 bp adjacent to 601 posi- 
tioning sequence) (fig. S3A). However, the loss 
of H2A-H2B dimer converted an inactive nucleo- 
some with 20-bp short linker DNA (26) into an 
active substrate for INO80 (fig. S3B). To under- 
stand the structural basis of this activity, we 
captured the active ATP state of the reconstituted
INO80-hexasome complex by addition
of adenosine 5’-diphosphate (ADP)·AlFx without
the need for cross-linking and determined
the structure of the complex by single-particle 
cryo-electron microscopy (cryo-EM) analysis 
(figs. S4 to S8 and table S1). The recombinant 
C.t. INO80 is composed of a catalytic core 
module [Ino80 adenosine triphosphatase 
(ATPase), Arp5, Ies6, Ies2, Rvb1, and Rvb2] as 
well as a nuclear actin-containing Arp8 module
(Arp8, actin, Arp4, Ies4, and Taf14) (23).
The three-dimensional (3D) classification and 
focused refinement procedures yielded two 
composite reconstructions of the hexasome 
bound to the INO80 core module with resolutions
of local reconstructions ranging from
2.9 to 7.6 Å (figs. S4 to S7 and S9 and table S1),
as well as a reconstruction of the linker DNA-bound
Arp8 module at a resolution of 3.7 Å
(figs. S4 and S7 and table S1). The anisotropy
of the cryo-EM density for the Arp8 module 
restricted its structural analysis to rigid-body 
fitting (figs. S7 and S10 and table S1). Most of the 
cryo-EM densities for the INO80 core module 
in complex with the hexasome are of sufficient 
resolution and quality for atomic modeling 
(figs. S5 to S8 and table S1). 


INO80 remodels the hexasome in a 
spin-rotated mode 


Fig. 1. Cryo-EM structure of INO80 core module in complex with hexasome.
(A) Comparison of the hexasome from our structure with a canonical
nucleosome (PDB ID: 1AOI). The hexasome was reconstituted from the Widom
601 sequence with linker DNA on one side (0-601-80). The disk face near
the linker DNA is denoted as the “proximal face,” which lacks the H2A-H2B
dimer. (B) Cryo-EM structure of the INO80 core module in complex with hexasome.
(C) Hexasome and nucleosome translocation by C.t. INO80 and S.c. INO80 (n = 3).
Fraction of centrally positioned hexasome is indicated as remodeled hexasome in
percent. (D) Ino80 ATPase motor and Arp5/Ies6 grip on the nucleosome (PDB
ID: 8AV6) and hexasome (this study). INO80 captures an unwrapped state of the
hexasomes. The Ino80 ATPase motor binds the entry site at SHL-3, whereas
the Arp5/Ies6 grip binds near the dyad and recognizes the proximal face through its
Arp5 grappler element. B-form DNA (gold) is modeled to illustrate the unwrapped
hexasomal linker DNA. A red dot indicates SHL-6 bound by Ino80 ATPase motor on
canonical nucleosome and the same base pair in the context of the hexasome
complex. (E) Upper panel: INO80 undergoes a 145° “spin rotation” in the
context of hexasome recognition. Lower panel: The ATPase motor binding
sites of different remodelers on mononucleosome (fig. S13). The directionality
of superhelical locations (SHL “−” or “+”) is defined by the DNA translocation
direction: Those proximal to the entry DNA are defined as “−”, otherwise as “+”.

Fig. 2. Recognition of noncanonical DNA and histone features leads to
INO80 activation. (A) (Left) Multivalent interactions between INO80 core
module and the proximal face of the hexasome. (Right) Bottom view of the
hexasome proximal face. The missing H3′ αN, the extended H4′ α3 helix (in orange,
labeled as H4′ β-α switch), and the H3 α1L1 elbow of the hexasome are indicated.
The Arp5 foot helix is shown in teal. (B) Front views of Arp5 foot recognition of
the H2A-H2B acidic patch on the nucleosome and of the exposed H3-H4
interface on the hexasome. Proximal H2B α1L1 elbow interaction on the
nucleosome and H3 α1L1 elbow interaction on the hexasome are shown in
close-up views. (C) Bottom views of Arp5 foot recognition of the proximal H2A-H2B
acidic patch on the nucleosome and of the exposed H3-H4 interface on the
hexasome. Arp5 heel and foot recognition of the proximal H2A-H2B acidic patch
on the nucleosome and of the exposed H3-H4 interface on the hexasome are
shown in close-up views. Key residues on heel and foot, which are targeted for
mutagenesis (Arp5^heel and Arp5^heel+foot), are shown as spheres. Histone
counterparts in both structures (H3 α1 versus H2B α1 and H4 α2 versus H2A α2)
are indicated. (D) Nucleosome translocation activity assays (n = 3) comparing
wild-type (WT) INO80-hexasome complex (INO80-0H80), H2A-H2B acidic
patch mutant (INO80-OH804"™*), and Arp5 mutants (Arp5°®, Arp5^heel, and
Arp5^heel+foot). (E) Hexasome translocation activity assays (n = 3) comparing WT
INO80-nucleosome complex (INO80-0N80), H2A-H2B acidic patch mutant
(INO80-ON80*°™*), and Arp5 mutants (Arp5°®°, Arp5^heel, and Arp5^heel+foot).

The cryo-EM structure of the hexasome in complex with INO80 reveals an unanticipated
architecture that is markedly distinct from known nucleosome-remodeler complexes (23, 27).
The INO80 core module engages the proximal face at the DNA entry site of the hexasome,
even though it lacks the H2A-H2B dimer. The intact H2A-H2B acidic patch remains, by
contrast, solvent exposed at the distal face of the hexasome and is not used as a binding
platform (Fig. 1, A and B). The overall complex shows a large structural rearrangement in
comparison to its canonical form (23, 27). The INO80 core module undergoes a spin rotation
by ~145° in respect to the hexasomal substrate (Fig. 1, D and E). Within this configuration,
INO80 maintains a conserved motor-stator-grip architecture, comprising the Ino80 Snf2-type
ATPase motor, the heterohexameric Rvb1/Rvb2 stator assembly, and the Arp5/Ies6 grip
subunits (Fig. 1B). Previous cryo-EM structures show how INO80 clasps nucleosomes in
between the ATPase motor domain at the DNA entry site at superhelical location −6 (SHL-6)
and the Arp5/Ies6 grip on the opposite site at SHL-2/-3 (23, 27). However, the hexasome in
complex with the INO80 core module exhibits considerably reduced histone-DNA interactions
that involve only 101 bp (Fig. 1A) instead of the canonical 147 bp of DNA wrapped around a
histone octamer. The unwrapping of 46-bp DNA creates a new location of the DNA entry site
for the hexasome, which we unambiguously identified by assigning the histone and DNA
sequence register in our cryo-EM density (fig. S8, D to F). The ATPase motor domain binds
to this DNA entry site at SHL-3 and further unwraps DNA from the histone core by at least
8 bp. The ATPase motor domain is captured in an active ATP-hydrolysis transition state and
shows clear density for ADP·AlFx at the active site (fig. S8F). The Arp5/Ies6 grip is located on
the opposite site of the hexasome near the dyad at SHL0/+1 (Fig. 1D and fig. S9C). This places
the multiarmed grappler element of Arp5 at the proximal face where it captures DNA (around
SHL+3) that becomes exposed upon unwrapping and recognizes other noncanonical features
of the hexasome core particle (Fig. 1D). At the dyad of the hexasome, we observed two
alternative binding configurations of the Arp5/Ies6 grip that are closely related to each other
by a hinge-like motion of core complex (state 1 and state 2). The DNA binding domain (DBD)
of Arp5 can switch between the major and minor groove binding modes at SHL0 and
SHL+1 (fig. S9).

Fig. 3. The Arp8 module flexibly tethers INO80 to long stretches of linker DNA. (A) Integrated
structural model of the overall INO80-hexasome architecture on the basis of cryo-EM reconstructions and
cryo-EM single-molecule level mapping (left). The 3D reconstructions (middle) and 2D class averages (right)
indicate substantial motion between the INO80 core module and the Arp8 module. A 20-bp B-form DNA
(white) was modeled on the basis of the center-to-center distance (170 Å) derived from cryo-EM
single-molecule level mapping of the two modules (B). The DNA-bound Arp8 module (PDB ID: 8A5Q) is flexibly
tethered to core module at the hexasome core particle by means of a disordered linker region of the Ino80
protein (dotted red). Gray arrows indicate possible motions of the complex. (B) (Left) Schematic illustrating
cryo-EM single-molecule level mapping. Particles of the INO80 core module and the Arp8 module were
analyzed separately. Center-of-mass coordinates (green and white crosses) were used to calculate the
individual nearest-neighbor distances between modules. (Middle) The distances of particle pairs are plotted
as a histogram (see fig. S17 for distances >300 Å). A prominent peak indicates an average distance of
170 Å between core and Arp8 module particles of the same complex (marked in red). Larger distances result
most likely from particles belonging to different complexes (marked in black). (Right) The 2D class averages
visualizing the entire complex were derived from particle pairs identified within the 170-Å peak of the
single-molecule mapping distribution (see method). Both INO80 core module-hexasome and Arp8 module-DNA, as
well as the connecting nucleic acid, are discernible.

At SHL-3 of the hexasome, the Snf2-type
ATPase motor adopts a conformation similar
to that on the canonical nucleosomes at SHL-6
(23, 27), which suggests that individual DNA
translocation steps can proceed through a
common mechanism. However, compared with
the canonical complex, the ATPase motor and
the Arp5/Ies6 grip operate on the hexasome in
a substantially different histone and DNA se- 
quence context. The complex is poised to 
translocate DNA along the H3-H4 tetramer 
toward the Arp5/Ies6 grip near the dyad, whereas
in the context of nucleosomes, the DNA is 
translocated along the H2A-H2B interface 
toward the Arp5/Ies6 grip around SHL-3. We 
probed these different modes of operation bio- 
chemically by introducing DNA single-stranded 
gaps and nicks at different superhelical loca- 
tions (figs. S11 and S12). A single-stranded DNA 
break downstream of SHL-6, which locates 
between the ATPase motor and the Arp5/Ies6
grip, severely compromised sliding activity on 
a nucleosome (fig. S11, A and B), as reported 
previously (28). However, in the context of a 
hexasome, such a DNA break downstream of 
SHL-6 showed little effect on the sliding
activity of INO80 (fig. S11, A and C). This is
consistent with the spin-rotated configuration of
INO80 on the hexasome because the DNA break
at SHL-6 resides ~25 bp upstream of the
ATPase motor within the unwrapped linker
DNA (fig. S11A). Previous studies suggested
that remodelers might be restricted to a sin- 
gular mode of nucleosome engagement (29), 
e.g., with the ATPase motor located at SHL+2 
or SHL-6 (Fig. 1E and fig. S13). The near- 
atomic structure of the INO80 core module 
bound to a hexasome reveals an unantici- 
pated behavior: Spin rotation switches INO80 
into alternative modes of operation when it 
encounters canonical or noncanonical nucleo- 
somal substrates. 


An H2A-H2B acidic patch-independent 
mechanism of remodeling 


To understand the core mechanism of remodel- 
ing hexasomes further, we probed the molecu- 
lar determinants of INO80’s activity at the 
proximal face of the hexasome. The INO80 
core module engages the hexasome through 
extensive, multivalent interactions with distinct 
DNA and histone features (Fig. 2A), which 
suggests a specific recognition mode that 
depends on loss of the H2A-H2B dimer. Struc- 
tural alignment to the canonical INO80 com- 
plex (23, 27) shows that the position of the 
missing H2A-H2B dimer is replaced, through 
spin rotation, by H3-H4 in the context of the 
hexameric histone core (Fig. 2, B and C). The 
Rvb1/Rvb2 stator element, as well as the Ies6 grip
subunit, recognizes this structural rearrangement
by contacting the H3 α1L1 elbow as well as the
H3 α3 helix, respectively (Fig. 2, A and B).
Moreover, the loss of the H2A docking domain
destabilizes the H3′ αN helix (Fig. 2A
and fig. S14), which has been previously shown
to facilitate DNA unwrapping (12). The H4′
C-terminal tail switches from a canonical β-strand
configuration stabilized by H2A into an extended
H4′ α3 helix that fills the central cavity
of the hexasome (Fig. 2A and figs. S8J and S14),
revealing a previously unanticipated structural
plasticity of the histone core.

Fig. 4. Structural model of noncanonical nucleosome remodeling at sites of transcription.
Transcription elongation leads to loss of H2A-H2B dimers in a distal orientation in respect to gene
promoters and proximal to DNA linker regions (16). Unwrapping of DNA (~46 bp) adds to the short 18-bp DNA
linker length between neighboring nucleosomes observed in yeast. The long stretches of unwrapped linker
DNA flexibly tether INO80 through its Arp8 module adjacent to hexasomes, and its catalytic core becomes
activated upon direct recognition of noncanonical DNA and histone features. Translocation of hexasomes away
from promoter regions is consistent with recent biochemical and in vivo mapping experiments (9). The
dinucleosome model (upper panel) was generated by connecting two nucleosomes (PDB ID: 7OHC) with an 18-bp
B-form linker DNA; the hexasome-nucleosome model (upper panel) is generated by connecting the
INO80-hexasome overall architecture to a canonical nucleosome (PDB ID: 7OHC). DNA unwrapping of the
hexasome (gray) extends the linker DNA to 64 bp. The lower panel shows the overall architecture of the INO80 complex as well as the structure-based directionality
of hexasome translocation. Interactions required for this activity are highlighted (Arp5 foot recognizing the exposed H3-H4 interface, ATPase motor on SHL-3,
Arp5 grip near the dyad, and Arp8 module on the unwrapped linker DNA).

The multiarmed grappler element of Arp5 
recognizes noncanonical features that result 
from DNA unwrapping as well as the loss of 
H2A-H2B. The unwrapping of one nucleosomal 
DNA gyre exposes DNA at SHL+2 to SHL+4. 
The long N-terminal helix of the grappler binds 
along this surface around SHL+3 while also 
pointing toward the newly created DNA entry 
site at SHL-3 (Fig. 1D). Our prior work iden- 
tified two conformations of the grappler, one 
that appeared to be inhibitory and one that 
was permissive for translocation of canonical 
nucleosomes (23). We exclusively observe the 
permissive conformation in the context of the 
hexasome, which is consistent with an active 
mode of the remodeler. The foot helix of the 
grappler recognizes the hexameric histone core 
by packing against the H3-H4 surface adjacent 
to the central H4′ switch region (Fig. 2A). The
3D classification and focused refinement com- 
bined with Rosetta modeling (30) yielded a 
backbone structure of the helical Arp5 foot 
fitted into the helical cryo-EM density (Fig. 2, 
B and C, and fig. S8H). The Arp5 foot mimics
the H2B α3 helix of the missing H2A-H2B
dimer (fig. S14C) by packing against H4 α2
(Fig. 2C). These findings suggest not only a
mechanism for direct recognition of hexasomes
by INO80 but also a dual-substrate specificity of 
the enzyme complex: The Arp5 foot helix can 
recognize both the acidic patch of a nucleosome 
(23) and the exposed H3-H4 interface at the 
proximal face of a hexasome. 

To corroborate this mechanism, we performed 
structure-based, site-directed mutagenesis 
and measured the activity of recombinantly 
produced INO80 by nucleosome remodeling 
assays. A mutation on the H2A-H2B acidic 
patch (0N80^APM; E61A/E64A/D72A/D90A)
(A, Ala; D, Asp; E, Glu) was shown to abrogate 
nucleosome translocation by INO80 (Fig. 2D 
and fig. S15) (23); however, it did not com- 
promise the translocation activity of INO80 
on hexasomes (Fig. 2E and fig. S15). This shows 
that the remaining H2A-H2B acidic patch of 
the hexasome is not required for hexasome 
translocation, which is consistent with its 
solvent-exposed structure at the distal face 
of the hexasome-INO80 complex (Fig. 1B). To 
directly probe the role of the Arp5 foot, we 
mutated the two key acidic patch-interacting 
residues at the heel of the Arp5 foot (Arp5^heel:
R501E/K502E) (K, Lys; R, Arg). These point
mutations increased hexasome sliding activity
but reduced nucleosome sliding (Fig. 2, D and
E, and fig. S16). Additional mutations of the
Arp5 foot helix (Arp5^heel+foot: R501E/K502E +
Q507A/R508A/M509S/K511A/I513S) (I, Ile; M,
Met; Q, Gln; S, Ser) resulted in an even more
pronounced difference between hexasomes and
nucleosomes (Fig. 2, D and E, and fig. S16). This
differential effect resembles the regulatory role
of the Arp5 foot in the context of nucleosomes.
Although mutation of the H2A-H2B acidic 
patch abolished activity, H2A.Z-mimicking 
mutations adjacent to the Arp5 foot helix in- 
creased the nucleosome sliding activity (23). 
However, mutation of the DNA-binding Arp5 
grip reduced activity on both the nucleosome 
and hexasome to a similar extent (Fig. 2, D and 
E, and fig. S16). Therefore, we conclude that 
the Arp5 grappler has a regulatory function in 
the context of hexasomes, which is similar to, yet 
mechanistically distinct from, that of nucleo- 
somes. Taken together, this shows that the 
Arp5 grappler conveys dual-substrate specifi- 
city and that direct recognition of the hexa- 
meric histone core regulates the remodeling 
activity of INO80. 


Recognition of unwrapped linker DNA by the 
Arp8 module tethers INO80 to hexasomes 


Given the extent of DNA unwrapping, we 
investigated next whether this contributes to 
hexasome recognition by INO80. Our previous
work identified the INO80 Arp8 module as a
sensor for promoter and linker DNA in the
context of nucleosomes (31). Cryo-EM data analysis
of the INO80-hexasome complex indeed
detects DNA binding of the Arp8 module. A
3.7-Å reconstruction of the 180-kDa DNA-bound
module (Fig. 3A and figs. S4 and S7) shows that
it adopts a structure similar to that recently determined
in the context of canonical nucleosomes
(31, 32): Monomeric nuclear actin, Arp4, and 
Arp8 are assembled onto the helical helicase- 
SANT-associated (HSA) domain of the Ino80 
ATPase subunit, which forms, together with the 
N-terminal regions of Arp8 and Ies4, a com- 
posite binding interface for ~35 bp of curved 
DNA (fig. S10). However, in contrast to the 
canonical context, extensive 3D classification 
as well as neural network-based approaches 
failed to detect direct protein-protein inter- 
actions between the Arp8 and the core module 
of INO80. A previously observed helical con- 
nection between the HSA and the post-HSA 
domain at the ATPase motor (31, 32) is absent 
in the context of the hexasome, which suggests 
a more flexible arrangement and, potentially, 
an extensive motion of the Arp8 module along 
the unwrapped DNA. 

To directly probe this dynamic assembly, we 
analyzed the cryo-EM data at the single-molecule 
level. We mapped individual nearest-neighbor 
distances between the two modules by using 
their independently refined center-of-mass coor- 
dinates (Fig. 3B and fig. S17). The obtained 
single-particle distance distribution shows a 
prominent peak at 170 Å (Fig. 3B), which reveals
that the DNA-bound Arp8 module is distantly 
located in respect to the hexasome-bound core 
module of INO80. Integrated structural model- 
ing on the basis of the determined cryo-EM 
structures and distance derived from the single- 
molecule analysis suggests a binding mode in 
which the two modules are separated by at least 
20 bp of unwrapped DNA (Fig. 3A). The dy- 
namic configuration of the overall INO80- 
hexasome architecture explains the missing 
helical connection between the modules, and 
it is fully consistent with a disordered protein
linker region (Ino80 protein residues 807 to 
964) that flexibly tethers the structurally ordered 
part of the HSA domain to the N-lobe of the 
Ino80 ATPase motor. Moreover, the relatively 
narrow peak distribution suggests a preferred 
linker DNA-binding site for the Arp8 module, 
which is in agreement with the curvature of 
the DNA observed by cryo-EM (fig. S10) as well 
as with its functional role in recognizing DNA 
shape and mechanics features (20, 32). Re- 
extracting particles identified by the single- 
molecule analysis enabled us to obtain 2D 
class averages that visualize the entire complex 
with both the INO80 core module bound to the 
hexasome as well as the distantly located Arp8
module bound to linker DNA (Fig. 3B). Our 
data suggest a functional architecture in which
the Arp8 module flexibly tethers INO80 adjacent
to hexasome core particles by recognizing
the dynamic properties of long stretches of 
unwrapped linker DNA. 
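
The nearest-neighbor mapping described above can be illustrated with a minimal sketch. It is not the authors' processing pipeline, which is not detailed in this section: the function names, the 10-Å bin width, and the particle coordinates below are all hypothetical, and the coordinates are assumed to be already expressed in angstroms on a single micrograph.

```python
import math
from collections import Counter

def nearest_neighbor_distances(core_centers, arp8_centers):
    """For each core-module particle, return the distance to the closest
    Arp8-module particle (same units as the input coordinates)."""
    distances = []
    for cx, cy in core_centers:
        nearest = min(math.hypot(cx - ax, cy - ay) for ax, ay in arp8_centers)
        distances.append(nearest)
    return distances

def histogram(distances, bin_width=10.0):
    """Count distances per bin; each key is the lower edge of a bin."""
    counts = Counter(int(d // bin_width) * bin_width for d in distances)
    return dict(sorted(counts.items()))

# Hypothetical center-of-mass coordinates (angstroms), for illustration only.
core = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
arp8 = [(170.0, 0.0), (1000.0, 165.0), (600.0, 1000.0)]

dists = nearest_neighbor_distances(core, arp8)
hist = histogram(dists)
print(dists)
print(hist)
```

In this toy input, two pairs fall close to 170 Å, as expected for modules of the same complex, whereas the third pair is far apart, as expected for particles from different complexes.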


Conclusions 


Taken together, our findings suggest a struc- 
tural mechanism for recognition and remodel- 
ing of hexasomes within the chromatin landscape 
(Fig. 4). In canonical yeast gene bodies, the 
activity of INO80 is most likely limited by a 
short 18-bp DNA linker length (9, 33). DNA 
unwrapping upon hexasome formation
extends linker DNA and gives access to INO80,
which turns hexasomes into active substrates
on a chromatin landscape with otherwise limited
linker DNA length (Fig. 4 and fig. S3B). Recent 
biochemical work and chromatin mapping in 
yeast identified a role of INO80 in chromatin 
(re)organization of actively transcribed genes 
(9), which lose H2A-H2B upon transcription 
elongation (16). Our data provide a structural 
basis to understand the underlying mecha- 
nism. INO80 captures the proximal face of 
the hexasome in an unwrapped state. Spin 
rotation of the core module as well as direct 
sensing of noncanonical histone and DNA 
features explain the activation of INO80’s 
remodeling machinery, and the Arp8 module 
tethers the complex adjacent to the hexasome 
at extended stretches of unwrapped linker 
DNA. The predicted directionality of hexasome 
translocation away from gene promoter regions 
is consistent with the measured activity in vivo 
(9). Our data reveal an unanticipated plasticity 
of the nucleosome as well as of the ATP- 
dependent chromatin remodeler machinery. 
The synergy of both creates locally a permissive 
remodeling environment that helps explain 
versatile, yet specific, reconfiguration of the 
chromatin landscape. 

A wealth of studies identified the H2A-H2B 
acidic patch of the canonical nucleosome as a 
general binding platform for chromatin fac- 
tors and a “hotspot” of chromatin biology (3, 25). 
In fact, structural insights into remodelers in the 
context of the hexasome were so far restricted 
to a fragment of human ALC1 bound to the 
H2A-H2B acidic patch (34). We show that 
INO80 detects the loss of H2A-H2B by recognizing
the exposed H3-H4 interface in the
context of a spin-rotated hexameric histone 
core. Our findings thus reveal a mechanism 
that is independent of the acidic patch and 
identify the hexasome as a distinct type of 
substrate for chromatin remodeling. 

The principles discovered for the hexasome 
and INO80 may apply more generally. Recent 
studies identified a prominent functional role 
of RSC and other SWI/SNF family remodelers 
in the context of hexasomes and other non- 
canonical nucleosomal particles (7, 8, 16). On 
the basis of our structural insights for INO80, 
we modeled SWI/SNF remodelers on hexasomes 


(fig. S18) and found that their multi-subunit 
architectures (35-37) are also compatible with 
hexasome recognition by means of a similar 
145° “spin-rotation” principle. In such a model, 
RSC recognizes the hexasome in an analogous 
manner to INO80 by contacting unwrapped 
linker DNA, the spin-rotated H3 a1L1 elbow, 
and the exposed H3-H4 interface (fig. S18). 
RSC has been recently shown to generate hexa- 
somes by actively ejecting H2A-H2B dimers 
upon histone acetylation (7). On the basis of 
our structural insights, a similar model is con- 
ceivable for INO80 but will require future 
studies that include histone modifications 
and variants as well as the interplay with his- 
tone chaperones, other remodelers, and the 
transcriptional machinery. Notably, our find- 
ings imply that the positions of regulatory 
histone modifications are substantially changed 
in the context of spin-rotated complexes, which 
illustrates how hexasome remodeling may
provide access to a different, as yet unexplored
layer of chromatin regulation.


Reorientation of INO80 on hexasomes reveals basis for mechanistic versatility

Hao Wu, Elise N. Muñoz, Laura J. Hsieh, Un Seng Chio, Muryam A. Gourdet, Geeta J. Narlikar, Yifan Cheng

Unlike other chromatin remodelers, INO80 preferentially mobilizes hexasomes, which can form during
transcription. Why INO80 prefers hexasomes over nucleosomes remains unclear. Here, we report structures of
Saccharomyces cerevisiae INO80 bound to a hexasome or a nucleosome. INO80 binds the two substrates
in substantially different orientations. On a hexasome, INO80 places its ATPase subunit, Ino80, at superhelical
location -2 (SHL -2), in contrast to SHL -6 and SHL -7, as previously seen on nucleosomes. Our results
suggest that INO80 action on hexasomes resembles action by other remodelers on nucleosomes such that
Ino80 is maximally active near SHL -2. The SHL -2 position also plays a critical role for nucleosome remodeling
by INO80. Overall, the mechanistic adaptations used by INO80 for preferential hexasome sliding imply that
subnucleosomal particles play considerable regulatory roles.


In eukaryotes, central nuclear processes
such as gene expression, DNA replication,
and DNA repair are coordinated with dynamic
changes in chromatin states (1-3).
ATP-dependent chromatin-remodeling enzymes
play essential roles in catalyzing such
changes. These enzymes are broadly catego- 
rized into four major families: SWI/SNF, ISWI, 
CHD, and INO80 (4, 5). Each of these enzymes 
contains a core remodeling ATPase subunit 
and several auxiliary subunits that regulate the 
core ATPase. It has typically been presumed 
that the preferred substrate of these enzymes 
is a nucleosome, the smallest unit of chromatin
containing ~147 base pairs (bp) of DNA wrapped 
around an octamer of histone proteins (6). Con- 
sistent with this assumption, between them, 
these four classes slide the histone octamer, 
exchange histone variants, and transfer en- 
tire octamers (5, 7). 

The INO80 complex has been shown to play 
roles in regulating transcription, DNA repli- 
cation, and DNA repair (8-17). However, how 
INO80’s biochemical activities relate to its 
diverse biological roles is not well understood. 
Unlike remodelers from other families, in 
which the core ATPase subunits bind the nu- 
cleosome near superhelical location 2 (SHL +2 
or SHL -2), Ino80, the core ATPase subunit of
the INO80 complex, binds nucleosomes near
SHL -6 or SHL -7 (fig. S1A) (12-14). It has been
speculated that this key difference in nucleosome
engagement reflects a fundamentally different
remodeling mechanism (15, 16). Indeed,
we showed that the preferred substrate of the
Saccharomyces cerevisiae INO80 complex is
not a nucleosome but a hexasome, which is a 
subnucleosomal particle that lacks a histone 
H2A-H2B dimer (17). Hexasomes are gener- 
ated during transcription and may also be 
formed during DNA replication and repair 
(18-21). Further, INO80’s activity on nucleo- 
somes is more dependent on flanking DNA 
length than on hexasomes (17, 22). These re- 
sults suggested that INO80 has the versatility 
to act on hexasomes or nucleosomes based on 
the density of nucleosomes and hexasomes at 
a given locus. However, fundamental mecha- 
nistic questions remain. It is not clear how 
INO80 can act on both nucleosomes and hex- 
asomes, which differ substantially in their 
structures. It is also unclear why INO80 has 
different flanking DNA length dependencies 
on hexasomes versus nucleosomes. 

Here, we report cryo-electron microscopy 
(cryo-EM) structures of endogenously purified 
S. cerevisiae INO80 bound to a hexasome and 
a nucleosome. We found that INO80 binds 
hexasomes and nucleosomes in opposite ori- 
entations, with Ino80 binding near SHL -2 on 
hexasomes and near SHL -6 or SHL -7 on 
nucleosomes. The location of the Arp8 module 
suggests how flanking DNA length differen- 
tially regulates nucleosome and hexasome slid- 
ing. DNA gaps near SHL -2 inhibit sliding of 
both substrates by INO80. Our findings pro- 
vide mechanistic insights into how INO80 
slides both hexasomes and nucleosomes. 


Structures of the INO80-hexasome and 
INO80-nucleosome complexes 


To visualize how INO80 binds to a hexasome 
or a nucleosome, we prepared hexasomes and 
nucleosomes on the same DNA templates 
containing the 147-bp, 601-nucleosome posi- 
tioning sequence with 80 bp of additional 
DNA as described previously (+80H and 
+80N; with definition explained in Fig. 1A; 
fig. S1, A and B; and the supplementary text)
(17, 23, 24). Complexes were formed by mixing
hexasomes or nucleosomes with endogenously
purified S. cerevisiae INO80 without
adding nucleotide (fig. S1, C to H). 

We determined cryo-EM structures of the 
INO80-hexasome complex in three different 
conformational snapshots (Fig. 1, B and C, 
and figs. S2 to S6). The overall shape of 
INO80 is similar within these structures and 
also to previously published structures of the 
nucleosome in complex with human (12) and 
Chaetomium thermophilum (14) INO80. Using 
prior convention, we grouped subunits of the 
INO80 complex into four modules: the Rvb module
(Rvb1/Rvb2), the Arp8 module (Arp8/Arp4/Actin/Ies4
and Taf14), the Ino80 module (Ino80/Ies2),
and the Arp5 module (Arp5/Ies6). The Ino80
protein consists of three major regions: the
N-terminal domain, the HSA region (Ino80^HSA),
and the ATPase domain (Ino80^ATPase). Detailed
descriptions of these modules in our structures 
are provided in the supplementary text. 

Although the INO80 architecture appears 
similar to that in the INO80-nucleosome struc- 
tures, a major difference is that it is rotated 
~180° on a hexasome compared with a nu- 
cleosome (Fig. 1, B to E). We identified two 
primary interactions between INO80 and the
hexasome: Ino80^ATPase binds the hexasome
near SHL -3 (class 1), SHL -2.5 (class 2), or
SHL -2 (class 3), and the Arp5/Ies6 module
binds near SHL +1, SHL +1.5, or SHL +2 (fig.
S6, A and B), respectively. Class 3 is the predominant
INO80-hexasome class. All Ino80^ATPase
locations on hexasomes are different than 
those on nucleosomes, which are near SHL -6 
or SHL -7 (12-14). However, the Ino80 orien- 
tation on hexasomes is consistent with struc- 
tures of other major chromatin remodelers 
on nucleosomes such as S. cerevisiae ISW1 
(25-27), Chd1 (28-30), RSC (31-33), Snf2 (34),
and in particular the SWR1 complex (35),
which is from the same subfamily as the INO80
complex. In these structures, the ATPase domains
interact with nucleosomes near either
SHL +2 or SHL -2 (Fig. 1E).

Loss of an H2A-H2B dimer in a hexasome 
causes an additional ~35 bp of DNA to un- 
wrap from the histone core (free DNA) (Fig. 
1A and fig. S1B). Comparison of our hexa- 
some structures with an unbound hexasome 
(PDB: 6ZHY) (36) reveals different levels of 
further DNA unwrapping. In class 1, the hex- 
asome is almost identical to the unbound 
hexasome, without detectable additional DNA 
unwrapping. The level of DNA unwrapping 
increases as the Ino80^ATPase-binding position
changes from SHL -3 (class 1) to SHL -2 
(class 3) (Fig. 2 and fig. S6C). 

Fig. 1. Structure of the INO80-hexasome complex reveals large rotation.
(A) Cartoon illustration of a +X nucleosome and a +X hexasome. H2A-H2B dimer
proximal to the flanking DNA (entry side dimer) is shown in cyan; H3-H4, light
gray; 601 DNA, dark gray; flanking DNA, orange; additional free (unwrapped)
DNA is also shown in cyan; SHLs are shown as yellow dots; DNA from the bottom
gyre is shown as a dotted line. (B) Two different views of cryo-EM density map
of the INO80-hexasome complex (class 3). (C) Atomic model of the INO80-hexasome
complex (class 3) viewed in the same orientation as the map is viewed
in (B). (D) Cryo-EM density map of the C. thermophilum INO80-nucleosome
complex [EMDB: 4277 (14)] displayed with its nucleosome dyad and H3-H4
tetramer aligned with that of the hexasome in the right panel of (B). Note that
INO80 on a hexasome rotates ~180° from where it sits on a nucleosome
when keeping the nucleosome-hexasome dyad and H3-H4 aligned. (E) Structural
comparisons of INO80-nucleosome complex (left), the SWR-nucleosome
complex (middle), and the INO80-hexasome complex (right), with the dyad and
H3-H4 of nucleosome and hexasome aligned.

hexasome (PDB: 6ZHY, gray), showing the degree of

For comparison, we also determined structures
of S. cerevisiae INO80 bound to a nucleosome
and captured two conformational snapshots
(class 1 and 2) from the same dataset (figs. S7
to S9 and supplementary text). Ino80^ATPase
in class 1 is located near SHL -7, similar to its
location in the human INO80-nucleosome 
structure (12), whereas in class 2, it binds near
SHL -6, similar to the C. thermophilum struc- 
ture (14) (fig. S9, A and C). The Arp5/Ies6 
module interacts with the nucleosome near 
SHL -3 and SHL -2 (fig. S9D), respectively. 
These observations are also consistent with 
previous findings showing that nucleosomal 
DNA between SHL -5 and SHL -7 is protected 
by INO80 (13). 


The SHL -2 position plays a critical role in 
nucleosome and hexasome sliding 


We observed that Ino80^ATPase engages the hexasome
predominantly near SHL -2. These
results raise the possibility that Ino80^ATPase
acts near SHL -2 when sliding hexasomes.
By contrast, consistent with prior findings
(12, 14), we observed that Ino80^ATPase engages
the nucleosome near two positions, SHL -7
and SHL -6. Also as previously proposed, our
findings are consistent with the possibility that
Ino80^ATPase acts near SHL -6 when sliding
nucleosomes (13). A commonly used assay to 
identify the DNA location from which the 
ATPase domain of a remodeler acts to translocate
DNA is to place a single nucleotide gap
at the proposed site of action and test whether 
the gap inhibits DNA translocation (37-39). 
Therefore, to directly test the importance of 
the SHL -6 and SHL -2 locations, we as- 
sembled nucleosomes and hexasomes with 
single base gaps near SHL -2 or SHL -6 and 
measured INO80 activity using a gel-based 
sliding assay (Fig. 3A). 

We found that a gap at SHL -6 inhibited 
INO80’s sliding activity on nucleosomes by 
~200-fold, but so did a gap at SHL -2 (Fig. 3, 
B to G). By contrast, a gap at SHL -6 did not 
inhibit INO80’s sliding activity on hexasomes, 
but a gap at SHL -2 slowed hexasome sliding
by ~2000-fold (Fig. 3, B to G). These
results are consistent with Ino80^ATPase acting
near SHL -2 when sliding hexasomes and 
raise new questions about why both the SHL -2 
and SHL -6 locations are critical for nucleo- 
some sliding by INO80. We describe possible 
explanations in the Discussion. 
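
The approximate fold effects quoted above can be checked against the mean observed rate constants listed in the Fig. 3 legend. The following is a rough sketch only: it takes ratios of the reported means and ignores the reported SEM values, and the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Mean observed rate constants (min^-1) as listed in the Fig. 3 legend.
K_OBS = {
    ("nucleosome", "no gap"): 1.551,
    ("nucleosome", "gap SHL -2"): 0.005995,
    ("nucleosome", "gap SHL -6"): 0.006497,
    ("hexasome", "no gap"): 1.01,
    ("hexasome", "gap SHL -2"): 0.000379,
    ("hexasome", "gap SHL -6"): 1.213,
}

def fold_inhibition(substrate, gap):
    """Ratio of the ungapped to the gapped rate constant (>1 means slower sliding)."""
    return K_OBS[(substrate, "no gap")] / K_OBS[(substrate, gap)]

for substrate in ("nucleosome", "hexasome"):
    for gap in ("gap SHL -2", "gap SHL -6"):
        fold = fold_inhibition(substrate, gap)
        print(f"{substrate:10s} {gap}: {fold:.1f}-fold")
```

With these values, both gaps slow nucleosome sliding by a few hundred-fold, the SHL -2 gap slows hexasome sliding by a few thousand-fold, and the SHL -6 gap leaves hexasome sliding essentially unchanged (ratio close to 1).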


Role of the Arp8 module in flanking DNA 
length dependence 

DNA unwrapping (top row) and binding locations of Ino80^ATPase and Arp5 (bottom row).

S. cerevisiae INO80 slides +40 nucleosomes
~100-fold more slowly than +80 nucleosomes
(17, 22). However, sliding hexasomes is less
flanking DNA dependent. Our structures sug- 
gest that the Arp8 module requires ~40 bp 
of DNA for appropriate engagement. In class 
1 of the INO80-hexasome structure, Arp8 en- 
gages with the ~35 bp of DNA unwrapped 
from removal of the H2A-H2B dimer and an 
additional ~5 bp of flanking DNA. In class 3 
of the INO80-hexasome structure, the Arp8 
module engages entirely with ~40 bp of un- 
wrapped DNA that now includes additional 
DNA unwrapped relative to the unbound hex- 
asome (Fig. 4). Conversely, in class 2 of the 
INO80-nucleosome structure, the Arp8 mod- 
ule engages entirely with flanking DNA, which 
is consistent with previous findings (40) (Fig. 
4). Our structural data with hexasomes, along 
with the previous data with nucleosomes, 
suggest that 40 bp may be the minimum 
amount of DNA needed for the Arp8 module 
to bind and that proper Arp8 module engage- 
ment is essential for maximal remodeling 
activity (40). 
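
The base-pair bookkeeping in this paragraph can be written out as a small sketch. It is a deliberately simplified model, not the authors' analysis: it assumes a fixed ~40-bp Arp8 footprint that is filled first by unwrapped DNA and then by flanking DNA, and it takes the ~15 bp of additional unwrapping in class 3 from the authors' Discussion. The function and variable names are hypothetical.

```python
ARP8_FOOTPRINT_BP = 40  # approximate DNA length the Arp8 module needs (from the text)

def flanking_dna_needed(unwrapped_bp, footprint_bp=ARP8_FOOTPRINT_BP):
    """Flanking DNA (bp) the Arp8 module still has to bind once the
    available unwrapped DNA has been used (never negative)."""
    return max(0, footprint_bp - unwrapped_bp)

cases = {
    "hexasome class 1": 35,       # ~35 bp unwrapped by loss of the H2A-H2B dimer
    "hexasome class 3": 35 + 15,  # plus ~15 bp further unwrapped with INO80 bound
    "nucleosome": 0,              # assumption: no unwrapped DNA available to Arp8
}

needed = {name: flanking_dna_needed(bp) for name, bp in cases.items()}
for name, bp in needed.items():
    print(f"{name}: {bp} bp of flanking DNA needed")
```

Under these assumptions the sketch reproduces the three situations described in the text: about 5 bp of flanking DNA in hexasome class 1, none in hexasome class 3, and the full ~40 bp on a nucleosome.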


Altered interactions by the Arp5 module 


To understand why Ino80 may not bind a
nucleosome directly near SHL -2, we compared
interactions made by Arp5/Ies6 in hexasomes
versus nucleosomes (see the supplementary
text). When INO80 binds to a hexasome, the
Arp5/Ies6 regions used in the context of a
nucleosome are repurposed for different interactions.
Modeling the missing H2A-H2B dimer
into our INO80-hexasome structure reveals
steric clashes of the Arp5 module with the
entry side proximal H2A-H2B dimer and with
part of the DNA that wraps around the H2A-H2B
dimer (fig. S11). These clashes could
be avoided if the H2A-H2B dimer is sufficiently
dislodged. To test for this possibility,
we inhibited dimer dislodgement by introducing
a site-specific disulfide cross-link between
the two H2A molecules (N38C) (41) or
promoted dimer dislodgement by using an
H2A mutant (R81A) that destabilizes the
H2A-H2B/H3-H4 interface (42) (fig. S12, A, B, and
H). The disulfide cross-link did not inhibit
nucleosome sliding, and the H2A mutant did not
promote nucleosome sliding (fig. S12, C to G),
indicating that complete dimer dislodgement
is not necessary for INO80-mediated nucleosome
sliding. In the absence of dimer dislodgement,
another way to avoid these clashes could
be by substantial rearrangement of the Arp5
module together with subtle rearrangements
of the H2A-H2B dimer (fig. S9E).

Fig. 3. Inhibition of DNA translocation at specific SHL sites influences
nucleosome and hexasome sliding by INO80. (A) Cartoon illustration of a
+80 nucleosome (left) and a +80 hexasome (right) with approximate locations
of site-specific single base gaps indicated. Colors are the same as in Fig. 1A.
(B and C) Example gels and time courses of native gel-based remodeling assays
of WT INO80 on +80 nucleosomes with no gap, a gap near SHL -2, and a gap
near SHL -6. (D and E) Example gels and time courses of native gel-based
remodeling assays of wild-type INO80 on +80 hexasomes with no gap, a gap
near SHL -2, and a gap near SHL -6. (F and G) Average observed rate constants
of INO80 sliding activity. k_obs (min^-1): +80N: 1.551 ± 0.1846; +80N Gap @ SHL -2:
0.005995 ± 0.001054; +80N Gap @ SHL -6: 0.006497 ± 0.0007117; +80H:
1.01 ± 0.1668; +80H Gap @ SHL -2: 0.000379 ± 0.0002849; +80H Gap @ SHL -6:
1.213 ± 0.2209. Data represent the mean ± SEM for three technical replicates
performed under single-turnover conditions with saturating enzyme and ATP.


Discussion
Implications of the INO80-hexasome structure
for nucleosome sliding by INO80


The major conformation of the INO80-hexasome
complex (class 3) has Ino80^ATPase near SHL
-2 and ~15 bp of unwrapped DNA from the
entry site in addition to the ~35 bp of DNA
that is unwrapped from removal of an H2A-H2B
dimer. The placement of Ino80^ATPase near
SHL -2 is consistent with how the ATPase 
subunits of other remodelers bind the nu- 
cleosome. Together with our prior finding that 
hexasomes are remodeled faster than nucleo- 
somes, these results suggest that the class 3 
structure represents the sliding-competent 
conformation of INO80 on hexasomes (Fig. 5A 
and fig. S13A). By contrast, the states of INO80 
bound to a nucleosome have Ino80^ATPase bound
near either SHL -6 or SHL -7, also consistent 
with previous findings. These differences raise 
the question of whether the INO80-nucleosome 
structures represent sliding-competent confor-
mations or if a rearrangement of Ino80^ATPase to
SHL -2 is necessary to achieve efficient nucleo- 
some sliding. 

Previous cross-linking studies have shown 
that detachment of nucleosomal DNA from 
H2A-H2B close to the entry site occurs during
INO80 remodeling (13). Our data show that
progressively more DNA is unwrapped as
Ino80^ATPase binds closer to SHL -2 on hex-
asomes (Fig. 2 and fig. S6C). Together, these
results suggest that DNA unwrapping is
coupled to Ino80^ATPase accessing its most
sliding-competent state. Footprinting studies
have shown that whereas binding of INO80 
to nucleosomes mainly protects nucleosomal 
DNA from SHL -5 to SHL -7 and near SHL -3, 
there is modest but detectable protection 
near SHL -2 (13). Nicks and gaps between SHL 
-7 and SHL -2 have been shown to inhibit 
nucleosome sliding to different extents (13, 43). 
Here, we show that site-specific gaps near 
SHL -2 or SHL -6 substantially inhibit INO80’s 
sliding of nucleosomes (by ~200 fold). DNA 
gaps are commonly used to identify the site of 
action of the ATPase domain of remodelers 
(37-39). We therefore speculate that INO80
initially binds the nucleosome with Ino80^ATPase
near SHL -6 or SHL -7, and that this is fol-
lowed by an ATP-dependent rotation around
the nucleosome to position Ino80^ATPase near
SHL -2, from which Ino80^ATPase then trans-
locates nucleosomal DNA (Fig. 5B and fig.
S14A). A gap near SHL -6 would then inhibit
ATP-dependent movement of Ino80^ATPase on
the nucleosome, whereas the gap near SHL -2 
would inhibit translocation of nucleosomal 
DNA by INO80 relative to the histone octa- 
mer (fig. S14). Single-molecule fluorescence 
resonance energy transfer studies have iden- 
tified an ATP-dependent pause phase be- 
fore ATP-dependent nucleosome sliding (22). 
The pause could represent the reorientation 
of Ino80^ATPase from SHL -6 or SHL -7 toward
SHL -2 and add a step that slows remodeling
of nucleosomes compared with hexasomes.
Simply placing the INO80 complex as is on
nucleosomes with the Ino80^ATPase near SHL
-2 results in steric clashes of the Arp5 module 
with the nucleosome (fig. S11). Although partial 
H2A-H2B dimer dislodgment, as previously 
proposed (17), could avoid such clashes, our 
biochemical data here indicate that dimer dis- 
lodgement is not essential for nucleosome 
sliding by INO80 (fig. S12). More structural
studies are needed to understand how INO80 
might rotate around a nucleosome. 
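
The fold inhibition quoted in the text can be checked against the observed rate constants listed in the Fig. 3 legend. The following is a minimal sketch of that arithmetic, not the authors' analysis; the dictionary keys and the helper name `fold_inhibition` are hypothetical labels introduced here for illustration.

```python
# Observed rate constants k_obs (min^-1), copied from the Fig. 3 legend.
# The key names are illustrative labels, not identifiers from the paper.
K_OBS = {
    "+80N": 1.551,
    "+80N gap SHL-2": 0.005995,
    "+80N gap SHL-6": 0.006497,
    "+80H": 1.01,
    "+80H gap SHL-2": 0.000379,
    "+80H gap SHL-6": 1.213,
}


def fold_inhibition(ungapped: str, gapped: str) -> float:
    """Ratio of the ungapped to the gapped rate constant (hypothetical helper)."""
    return K_OBS[ungapped] / K_OBS[gapped]


for substrate in ("+80N", "+80H"):
    for site in ("SHL-2", "SHL-6"):
        ratio = fold_inhibition(substrate, f"{substrate} gap {site}")
        print(f"{substrate}, gap near {site}: k_obs ratio = {ratio:.1f}")
```

A ratio well above 1 indicates inhibition by the gap; a ratio near 1 indicates little effect, as for the hexasome with a gap near SHL -6.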

Alternatively, a gap near SHL -2 may affect 
the action of the Arp5 module. For such a 
scenario, we speculate that Ino80^ATPase trans-
locates DNA near SHL -6, and effective trans-
location also requires action of the Arp5
module near SHL -2, as previously proposed 
(12, 14). A gap near SHL -6 would then in- 
hibit translocation of nucleosomal DNA by 
Ino80^ATPase, and a gap near SHL -2 would
inhibit productive engagement of the Arp5 
module (fig. S15). 

Clearly distinguishing between these two mod- 
els will require substantial additional structural 
analysis of INO80-remodeling intermediates on
nucleosomes. 


Implications for hexasome sliding by INO80 


Our structures provide a view into how INO80 
engages a hexasome. In the predominant 
INO80-hexasome structure, Ino80^ATPase binds
near SHL -2. A site-specific gap near SHL -2 
substantially inhibits INO80’s sliding of hex- 
asomes (~2000 fold), whereas a gap near 
SHL -6 does not have a major effect. We 
therefore hypothesize that Ino80^ATPase bound
at SHL -2 on a hexasome represents the active 
structure. Compared with the subtle changes 
at SHL -2 observed when other remodelers 
bind nucleosomes (16), the additional 15 bp of
unwrapped DNA (up to SHL -2.5) in class 3
substantially loosens histone-DNA interactions
and thus may allow more ready translocation 
from SHL -2. We further propose that the new 
contacts made by the Arp5/Ies6 module with 
the exposed H3-H4 surface provide an anchor
that allows the Ino80 motor to efficiently 
pump DNA through the hexasome. These 
findings also explain the differential effects 
of the Arp5/Ies6 module on hexasome versus 
nucleosome sliding (17). The location of the 
Arp8 module is also different on hexasomes 
than on nucleosomes. On nucleosomes the 
Arp8 module binds ~40 bp entirely on the 
flanking DNA (Fig. 4). In the most prevalent 
INO80-hexasome state (class 3), the Arp8 
module is bound entirely to the unwrapped 
DNA, substantially reducing the need to bind 
flanking DNA (Fig. 4). These different binding 
modes of the Arp8 module could explain why 
hexasome sliding by INO80 is less dependent 
on flanking DNA length compared with nu- 
cleosome sliding. 
Programming correlated magnetic states with
gate-controlled moiré geometry


Eric Anderson, Feng-Ren Fan, Jiaqi Cai, William Holtzmann, Takashi Taniguchi, Kenji Watanabe,
Di Xiao, Wang Yao, Xiaodong Xu


The ability to control the underlying lattice geometry of a system may enable transitions between emergent 
quantum ground states. We report in situ gate switching between honeycomb and triangular lattice geometries 
of an electron many-body Hamiltonian in rhombohedral (R)-stacked molybdenum ditelluride (MoTe2) moiré 
bilayers, resulting in switchable magnetic exchange interactions. At zero electric field, we observed a correlated 
ferromagnetic insulator near one hole per moiré unit cell with a widely tunable Curie temperature up to
14 K. Applying an electric field switched the system into a half-filled triangular lattice with antiferromagnetic
interactions; further doping this layer-polarized superlattice tuned the antiferromagnetic exchange interaction
back to ferromagnetic. Our work demonstrates R-stacked MoTe2 moirés to be a laboratory for engineering
correlated states with nontrivial topology.


The physical properties of crystalline solids
are fundamentally determined by their
lattice structure. The ability to control lat-
tice parameters would thus enable access
to a complex electronic phase diagram.
Moiré superlattices of two-dimensional (2D) 
van der Waals crystals have recently emerged 
as powerful synthetic quantum materials ca- 
pable of achieving designer Hamiltonians with 
controllable superlattice constants, layer stack- 
ing arrangements, and Coulomb interaction 
strengths (2). So far, a plethora of correlated and 
topological electronic states have been demon- 
strated in the triangular lattice (2-11). However, 
the honeycomb lattice, a model system for in- 
vestigating strongly correlated phenomena, 
remains to be explored. In addition, in situ con- 
trollable superlattice geometry has not been 
realized, which would be a fundamentally differ- 
ent approach to electrically tuning phase tran- 
sitions between states with distinct symmetries. 
Rhombohedral (R)-stacked transition-metal
dichalcogenide (TMD) moiré bilayers may offer
such opportunities (12-17). As shown in Fig. 1A,
the moiré potential has two degenerate energy 
minima within a supercell, the MX (B sublat- 
tice) and XM (C sublattice) sites, where MX 
denotes the transition-metal atoms (M) of one 
layer sitting atop of chalcogen atoms (X) of the 
other. The corresponding moiré orbitals are 
localized in opposite layers, forming a honey- 
comb lattice, with the sublattice pseudospin 
locked to the layer pseudospin (Fig. 1B). The 
application of a vertical electric field induces 
layer polarization and breaks energy degeneracy
between the moiré orbitals localized in the B 
and C sublattices, leading to a transition from 
honeycomb to triangular lattice symmetry (Fig. 
1B). Thus, at a doping of one hole per moiré site, 
this electric field switches the hole lattice be-
tween a two-orbital, quarter-filled honeycomb
lattice and a one-orbital, half-filled triangular lattice,
when taking into account spin degeneracy.
The R-stacked homobilayer moiré has been 
theoretically predicted to host an array of in- 
triguing phenomena, including the quantum 
spin Hall effect, ferroelectric Mott insulators, 
integer and fractional quantum anomalous 
Hall states, and electric field-induced electronic 
phase transitions (13, 14, 18-25). Experimentally, 
moiré ferroelectricity (26), interlayer exciton- 
electric polarization coupling (27), correlated 
electronic phases (28, 29), and quantum crit- 
icality near one hole per moiré unit cell [filling 
factor (v) = -1] (30) have been reported in 
twisted R-stacked bilayers. In this work, we show 
that R-stacked twisted bilayer molybdenum 
ditelluride (MoTe2) is a model system for ex-
ploring interaction-induced magnetism with
electrically tunable moiré geometry. 


Correlated insulating states on a 
honeycomb lattice 


We fabricated near-4° twisted MoTe2 bilayers
in a dual gated structure, which allowed in- 
dependent control of carrier density n and 
displacement field D. There is a small built- 
in displacement field of 0.04 V/nm in the 
device, which was likely caused by imperfect 
sample fabrication. For simplicity, an effective 
D with this built-in field offset subtracted is 
presented in the rest of this paper (31). To char-
acterize the device properties, we performed
photoluminescence (PL) (Fig. 1C) and optical
reflectance (fig. S1) measurements versus n at
fixed D = 0 V/nm. The experimental temper- 
ature was 1.6 K unless otherwise specified. 
Comparing this twisted bilayer PL with doping-
dependent monolayer data taken under iden-
tical conditions (Fig. 1D), we observed similar
general behavior: a charge-neutral exciton near
zero gate voltage, and trions that appear as 
gating induces electrostatic doping (32). 
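
Independent control of n and D in a dual-gated device is commonly described with a parallel-plate capacitor model. The sketch below is a minimal illustration under that assumption and is not the calibration used in this work; the function name, the sign convention for D, and the example dielectric values are all hypothetical. Only the 0.04 V/nm built-in offset comes from the text.

```python
EPS0 = 8.8541878128e-12      # vacuum permittivity (F/m)
E_CHARGE = 1.602176634e-19   # elementary charge (C)


def gate_to_n_and_d(v_top, v_bottom, c_top, c_bottom, d_offset=0.04):
    """Parallel-plate estimate of carrier density and displacement field.

    v_top, v_bottom : gate voltages (V)
    c_top, c_bottom : geometric gate capacitances per unit area (F/m^2)
    d_offset        : built-in displacement field to subtract (V/nm)

    Returns (n in cm^-2, effective D in V/nm). The sign convention for D
    is an assumption of this sketch.
    """
    n_m2 = (c_top * v_top + c_bottom * v_bottom) / E_CHARGE       # m^-2
    d_v_per_m = (c_top * v_top - c_bottom * v_bottom) / (2 * EPS0)  # V/m
    return n_m2 * 1e-4, d_v_per_m * 1e-9 - d_offset


# Illustrative gate dielectric: relative permittivity 3, thickness 30 nm
# (assumed values, not taken from the paper).
c_gate = 3 * EPS0 / 30e-9
n, d_eff = gate_to_n_and_d(1.0, -1.0, c_gate, c_gate)
print(f"n = {n:.3e} cm^-2, effective D = {d_eff:.3f} V/nm")
```

In this model, equal and opposite gate voltages on matched gates change D without changing n, which is the sense in which the two quantities are independently tunable.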

The similarity of the monolayer and bilayer 
doping-dependent PL data suggests that twisted 
MoTe2 bilayer is a direct bandgap semicon-
ductor. Similar direct bandgap nature has been
observed in as-exfoliated 2H-stacked bilayer
MoTe2 (33, 34), where band-edges are at the
corners of the hexagonal Brillouin zone (or 
valleys). The direct-bandgap property is distinct 
from other TMD homo- and heterobilayers, 
which have an indirect bandgap. Therefore, in 
twisted MoTe2 bilayers, the valley composition
is unambiguous for both charged carriers and 
excitons. The latter is particularly appealing as 
a sensitive optical probe of correlated physics, 
owing to the sharpness of the exciton spectra 
(~5 meV) (fig. S2). 

Distinct from the monolayer, however, the 
trion PL in the twisted bilayer displayed clear, 
sharp energy jumps at finite values of doping. 
This observation resembles the behavior of 
the interlayer exciton in a tungsten disulfide-
tungsten diselenide (WS2-WSe2) heterobilayer,
which has been used to probe the formation of
correlated electronic states at integer and frac-
tional filling in moiré superlattices (8, 35-37).
By comparing PL and reflectance data, we could
identify the integer filling factor (v = 1, 2, -1) and
infer a twist angle of about 3.9° [(31), section 1.3].
The sharpness of the integer filling factor fea- 
tures indicates that the twist angle, and thus 
the moiré superlattice wavelength, does not 
change substantially within the beam spot. A 
notable feature of the doping-dependent PL is 
the drop in trion intensity at integer fillings
|v| = 1 and 2. This supports the interpretation
of insulating state formation because this would
reduce the free carrier population needed to
form trions, thus suppressing the trion PL.
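
The relation between twist angle and moiré length scale can be made concrete with the standard small-angle expression for a twisted homobilayer, a_M = a / (2 sin(θ/2)). The sketch below evaluates it for the 3.9° twist angle quoted above. The monolayer lattice constant of 0.352 nm is an assumed literature value that is not stated in this text, and the function names are hypothetical.

```python
import math

# Assumed monolayer MoTe2 in-plane lattice constant (nm); not given in the text.
A_MOTE2_NM = 0.352


def moire_wavelength_nm(twist_deg: float, a_nm: float = A_MOTE2_NM) -> float:
    """Moiré period of a twisted homobilayer: a / (2 sin(theta / 2))."""
    return a_nm / (2.0 * math.sin(math.radians(twist_deg) / 2.0))


def density_per_filling_cm2(twist_deg: float, a_nm: float = A_MOTE2_NM) -> float:
    """Carrier density for one carrier per moiré unit cell (|v| = 1)."""
    a_m = moire_wavelength_nm(twist_deg, a_nm)
    cell_area_nm2 = (math.sqrt(3.0) / 2.0) * a_m ** 2  # hexagonal unit cell
    return 1e14 / cell_area_nm2                        # 1 nm^-2 = 1e14 cm^-2


a_m = moire_wavelength_nm(3.9)
n_1 = density_per_filling_cm2(3.9)
print(f"moiré period = {a_m:.2f} nm, density at |v| = 1 = {n_1:.2e} cm^-2")
```

Under these assumptions, the densities at which the integer-filling features appear scale with the inverse square of the moiré period, which is why sharp features imply a nearly uniform twist angle within the beam spot.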


Observation of ferromagnetism near v = -1 


We performed reflective magnetic circular di-
chroism (RMCD) measurements to investigate
magnetic interactions. The optical excitation 
was chosen at 1.12 eV with a bandwidth of 
30 meV (wavelength dependence is provided 
in fig. S3). The RMCD signal is shown in Fig. 1E 
as a function of v and D. The data were taken 
by first initializing the sample at out-of-plane 
magnetic field μ0H = 0.5 T and then sweep-
ing back to zero field. Remnant RMCD signal,
the signature of a ferromagnetic state, is pro- 
nounced on the hole side in the vicinity of v = -1. 
As we demonstrate below, this is a 2D magnetic 
phase diagram over v and D. The focus of this 
work is to understand the phase diagram. 

We began with investigation of the mag- 
netic state at D = 0, in the honeycomb lattice. 
The RMCD signal is shown in Fig. 2A at v = -1 
versus out-of-plane μ0H swept down and up.
Fig. 1. ... bilayer in R-stacking. High symmetry points with local energy minima are
highlighted. The green circles correspond to MX sites (B sublattice), where
the metal atom M in the top layer is aligned with the chalcogen atom X in the
bottom layer. Orange circles are the corresponding XM sites (C sublattice).
Dotted lines indicate a single moiré unit cell. (B) Application of vertical electrical
field (D) lifts the energy degeneracy of the layers and switches the moiré ...
photon energy. (D) Doping-dependent PL of monolayer MoTe2, showing
exciton (X⁰) and trion (X⁺ and X⁻) features. (E) Reflective magnetic circular
dichroism (RMCD) signal intensity plot as a function of v and D of the twisted
bilayer without magnetic field. The nonzero RMCD signal was observed close
to v = -1 and symmetric in D. The plot represents a 2D ferromagnetic
phase diagram. All data were taken at T = 1.6 K.


We observed a pronounced hysteresis loop with 
a width of about 40 mT, a hallmark of ferro- 
magnetism. Sharp switching of the RMCD signal 
near the critical field implies a spin-flip transition 
with an out-of-plane easy axis. The temperature 
dependence of RMCD versus μ0H is shown in
Fig. 2B, further confirming the ferromagnetic 
state. Both the hysteresis loop width and rem- 
nant RMCD signal decreased as temperature 
increased, and eventually vanished above the 
Curie temperature (Tc) of about 14 K.

The observed ferromagnetic state near v = -1 
is consistent with the ground state of a quarter- 
filled honeycomb lattice, gapped by the next- 
nearest-neighbor complex hopping in twisted 
R-type homobilayers (14). We obtained the
spin-resolved band structure from Hartree-
Fock calculations (31). As shown in Fig. 2C, a
fully spin-polarized valence band was ob- 
tained within this framework. Calculations 
show that at D = 0, hole density is equally 
distributed between the B sublattice in one 
layer and C sublattice in the other layer (Fig. 
2D). The B and C orbitals have appreciable spa- 
tial overlap where the carrier wave function 
becomes layer hybridized, such as at the A 
corners of the moiré unit cell and the middle 
points between B and C. It is the Coulomb
exchange through the spatial overlap that 
gives rise to nearest-neighbor ferromagnetic 
interactions. The calculations also found that 
the spin band is topological with a Chern 
number of 1; v = -1 is a quantum anomalous 
Hall (QAH) insulator. This result is consistent 
with theoretical predictions that a QAH state 
can be realized in a quarter-filled honeycomb 
lattice (38) such as graphene. The existence of 
a QAH state has also been predicted in twisted 
MoTe2 bilayers under similar conditions (13, 21).
In our calculations, a larger value of the moiré 
potential than in (13) was used, but one which 
is consistent with the value extracted from 
large-scale density functional theory (DFT) 
calculations with lattice reconstruction (39). In 
addition, another study that leveraged Hartree- 
Fock methods to explore the magnetic ground 
states of this system found results that were 
consistent with our observations (40). 


Doping and electric field control of magnetic 
phase transitions 


The ferromagnetic state can be tuned by means 
of electrostatic doping. Fig. 2E, top and bottom, 
are the RMCD intensity plots versus v, for μ0H
swept down and up, respectively (line cuts are
provided in fig. S4). The difference of the two 
plots yields the residual RMCD signal and 
hysteresis loop width. As shown in Fig. 2F, the
ferromagnetic state exists for the range of v be-
tween -0.5 and -1.3, indicating a possible tran-
sition from ferromagnetic insulator at v = -1
to ferromagnetic metal upon doping. The hys-
teresis loop width appeared jumpy over chang-
ing v and shrank near the phase space boundary.
The jumpiness was likely caused by domain 
dynamics near the spin flip transition. We 
performed temperature-dependent RMCD 
at selected filling factors within this v range 
(fig. S5). The extracted Tc varied by a factor
of four, from about 14 K near v = -1 to 3 K 
(limited by our base temperature of 1.6 K) near 
the phase boundary (Fig. 2G). This result dem- 
onstrates strongly doping-dependent ferro- 
magnetic properties. 
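
The hysteretic component described above is the difference between the RMCD signal measured with the field swept down and swept up. The sketch below illustrates one way such a difference and a loop width could be extracted. It runs on a synthetic square-like loop, not on measured data, and the function name, the half-maximum threshold, and the coercive field of 20 mT are assumptions chosen so that the synthetic loop is about 40 mT wide.

```python
import numpy as np


def hysteretic_component(field, rmcd_down, rmcd_up, threshold=0.5):
    """Return (down - up difference, loop width).

    The width is the field span over which |difference| exceeds
    `threshold` times its maximum (a half-maximum criterion by default).
    """
    diff = rmcd_down - rmcd_up
    inside = np.abs(diff) > threshold * np.max(np.abs(diff))
    width = field[inside].max() - field[inside].min() if inside.any() else 0.0
    return diff, width


# Synthetic loop (hypothetical): sharp switching at -h_c on the down sweep
# and at +h_c on the up sweep.
field = np.linspace(-0.1, 0.1, 401)             # tesla
h_c = 0.02                                      # assumed coercive field (T)
rmcd_down = np.tanh((field + h_c) / 0.002)      # stays positive until -h_c
rmcd_up = np.tanh((field - h_c) / 0.002)        # stays negative until +h_c

diff, width = hysteretic_component(field, rmcd_down, rmcd_up)
print(f"loop width = {width * 1e3:.0f} mT")
```

Repeating this extraction at each filling factor or temperature gives maps of the kind used to delimit the ferromagnetic region of phase space.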

We next measured the RMCD signal as a 
function of D at v = -1. Figure 3, A and B, are 
the RMCD intensity plots versus μ0H swept
down and up, respectively. The difference be- 
tween them (Fig. 3C) demonstrates the rem- 
anent RMCD signal and hysteretic effects, 
which highlight the electric field tunability of 
Fig. 2. Ferromagnetism near quarter-filled honeycomb lattice. All data
were taken at D = 0. (A) RMCD signal versus out-of-plane magnetic field
(μ0H) swept down and up at v = -1. The observed hysteretic behavior
demonstrates the ferromagnetic state. (B) Temperature-dependent RMCD at
v = -1, showing behavior typical of a ferromagnetic state. Data are offset for
clarity. (C) Calculated valley-resolved moiré band structure in 3.9° twisted
bilayer MoTe2. The solid red and blue dashed lines indicate spin up and down
bands, respectively. The black dashed line indicates the chemical potential.
(D) Calculated hole density spatial distribution over the moiré unit cell at v = -1.
Density is distributed evenly between the bottom layer at the B sites and the top
layer at the C sites, with appreciable spatial overlap between the nearest-neighbor B
and C sites. Color saturation corresponds to normalized hole density. (E) RMCD signal
intensity plot versus filling factor v and μ0H swept down (top) and up (bottom).
(F) Difference of (E) top and bottom, giving the hysteretic component of the RMCD.
This highlights the filling-factor phase space for the ferromagnetic state. (Inset) The
blue dotted line in the 2D phase diagram corresponds to the range of v in (E) and (F).
(G) Filling factor-dependent Curie temperature. Error bars correspond to the
temperature sampling resolution. Data in (A), (E), and (F) were taken at T = 1.6 K.


the ferromagnetic state. The hysteresis loop 
width remained finite as D increased, until the 
hysteretic behavior finally vanished for D val- 
ues larger than ~0.2 V/nm. The Curie tem- 
perature was correspondingly tuned upon 
increasing D by a factor of four (Fig. 3D). The 
suppression of the ferromagnetic interaction 
was caused by the charge redistribution be- 
tween the layers as D field increased. Above 
the critical displacement field value, full layer 
polarization developed (fig. S6). At this value 
of D, the on-site energy difference between 
the two honeycomb sublattices due to the dis- 
placement field was sufficiently large that all 
carriers became confined to a single-layer sub- 
lattice, with the sublattice of the opposite layer 
empty. In this regime, Coulomb exchange be- 
tween the filled sites (either B or C, depending 
on the direction of the applied D field) was 
quenched by their larger separation compared 
with that of the honeycomb lattice. The sys- 
tem is reminiscent of the extensively discussed 
triangular lattice Hubbard model in hetero-
bilayer TMD moirés (2-7). 

To reveal the magnetic interactions in the 
fully layer pseudospin polarized triangular 
lattice, we performed RMCD measurements 
versus μ0H as a function of temperature. The
results at D = 0.32 V/nm are shown in Fig. 3E. 
We observed a paramagnetic-like response 
curve with saturated RMCD signal at high 
magnetic field at base temperature, which 
disappeared within the applied magnetic field 
range at high temperature. We extracted the 
slope of the RMCD signal curve near μ0H = 0
(simplified as the zero-field slope) for use as a proxy of the
magnetic susceptibility χ, as in previous re-
ports (41). We then plotted 1/slope versus tem-
perature and fit the data to the Curie-Weiss
law, χ = C/(T − θ_C) (Fig. 3F). The fitted line inter-
cepts the temperature axis at a negative val-
ue, yielding a Curie-Weiss temperature θ_C of
about -3.5 K. This negative θ_C demonstrates
an antiferromagnetic interaction between local
moments in the moiré traps. This inferred anti-
ferromagnetic interaction is consistent with the 
120° Néel order of a half-filled triangular lattice 
with strong onsite Coulomb interaction U. 
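
The Curie-Weiss analysis described above amounts to a straight-line fit: if the susceptibility follows χ = C/(T − θ), then 1/χ is linear in T and crosses zero at T = θ. The following sketch shows that fit on synthetic data generated with θ = -3.5 K. The data, the constant C, and the function name are hypothetical and do not reproduce the measured slopes.

```python
import numpy as np


def curie_weiss_fit(temperature, slope):
    """Fit 1/slope = (T - theta) / C with a straight line.

    Returns (theta, C), where theta is the intercept of the fitted line
    with the temperature axis.
    """
    inv_slope = 1.0 / np.asarray(slope, dtype=float)
    a, b = np.polyfit(np.asarray(temperature, dtype=float), inv_slope, 1)
    return -b / a, 1.0 / a   # line is a*T + b with a = 1/C and b = -theta/C


# Synthetic susceptibility proxy (assumed values, arbitrary units).
theta_true, c_true = -3.5, 2.0
temps = np.linspace(5.0, 40.0, 15)             # kelvin
slopes = c_true / (temps - theta_true)

theta_fit, c_fit = curie_weiss_fit(temps, slopes)
print(f"theta = {theta_fit:.2f} K, C = {c_fit:.2f}")
```

A negative fitted θ corresponds to the line meeting the temperature axis below zero, the signature of antiferromagnetic interactions in this analysis.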

To explore the nature of the D field-induced 
magnetic phase transition, we performed RMCD 
measurements for D swept forward and back-
ward at μ0H = 0. The data were taken in the
slightly underdoped regime (v = -0.9) to avoid
the complications owing to domain effects near
v = -1 (fig. S7). As shown in Fig. 3G, there was
no appreciable hysteresis between the curves 
near the phase transition boundary. Because 
hysteresis is a signature of a first-order phase 
transition, the lack of hysteresis implies that 
the D field-induced magnetic phase transition 
is second order. To further support this understanding, we extracted the RMCD
slope near μ0H = 0 at selected D values in the fully layer polarized state, in
the antiferromagnetic interaction regime. The extracted RMCD slopes near
μ0H = 0, proportional to the magnetic susceptibility χ, rapidly increased as D
approached the phase transition boundary. The singular behavior of the RMCD
slope near the phase transition is consistent with the expectation of a
second-order phase transition. In essence, what we have demonstrated is a
distinctive magnetoelectric effect of a correlated charge insulating state by
tuning the moiré Hamiltonian from a honeycomb to triangular lattice.


Proximity control of the magnetic
exchange interaction


Last, we discuss the reemergence of ferromagnetic states with increasing hole
doping above v = -1 at large D, as seen in the 2D phase diagram in Fig. 1E.
Using D = 0.26 V/nm as a representative large D value, RMCD measurements at
selected v varying from -1 to -1.5 are shown in Fig. 4A. As doping increased
away from the v = -1 insulating state, RMCD hysteresis signal appeared,
increased, and eventually vanished again. The residual RMCD signal versus v
between μ0H swept down and up is plotted in Fig. 4B (raw data are available
in fig. S8), which highlights the phase space of the revival of the
ferromagnetic states. The ferromagnetic phase at D = 0.26 V/nm spans a range
of v between about -1.1 and -1.4, which is a substantial shift with a reduced
range compared with the D = 0 ferromagnetic phase centered around v = -1. The
doping- and D-dependent magnetism of the system is very repeatable, appearing
at different spots on the sample, as well as in other samples. The behavior is
robust to small variations in local twist angle, which were ~0.2° over the
sample area (fig. S9).

Fig. 3. ... interactions. All data were taken at v = -1, except for data in (G), which
was taken at v = -0.9. (A and B) RMCD signal versus D and μ0H swept (A) down
and (B) up. (C) Hysteretic component of the RMCD versus D, extracted from
the difference between (A) and (B). (Inset) The blue dotted line indicates the
phase space of the data taken in (A) to (C). Ferromagnetic states can be
continuously tuned by electric field and eventually switched off once full layer
polarization is achieved (a half-filled triangular lattice). (D) D field-tunable
Curie temperature. (E) Temperature-dependent RMCD in the fully layer-polarized
state at D = 0.32 V/nm. Data are offset for clarity. (F) Curie-Weiss fit ...
temperature. Negative Curie-Weiss temperature θ_C implies antiferromagnetic
interactions between local moments. (G) RMCD (μ0H = 0) versus D swept down
(green) and up (orange). Red dots are the extracted RMCD slope near μ0H = 0
as D approaches the ferromagnetic phase from the antiferromagnetic interaction side,
showing singular behavior. (Insets) The spin arrangements with the antiferromag-
netic and ferromagnetic interactions in the (top) triangular and (bottom) honeycomb
lattices, respectively. The in-plane spin denotes the superposition of spin up in the
K valley and spin down in the -K valley, with their relative phase corresponding
to the in-plane angle. Data in (A) to (C) and (G) were taken at T = 1.6 K.

Anderson et al., Science 381, 325-330 (2023)


Discussion and outlook


The above results showcase another capability of the twisted MoTe2 bilayer
system: controlling the magnetic order of the correlated insulating state in a
triangular lattice by proximity to a moiré layer with tunable doping (Fig. 4, C
and D). Starting at v = -1 at large D, the top layer is an insulator with
antiferromagnetic interactions between local moments on the B sites of the
moiré, and the bottom layer is an empty triangular lattice (Fig. 4C). As we
increased the doping to v = -(1 + x), with x < 1, as long as the onsite Coulomb
interaction U was larger than the D field-induced charge transfer gap, the
extra carrier x went to the first miniband of the lower layer, and the chemical
potential remained within the charge gap of the top layer (Fig. 4D). This
realized a Mott insulating state in proximity to a doped moiré layer.
Distinct from the proposed gate-tunable ex- 
change interactions of moiré Kondo lattices in 
heterobilayer geometry (42, 43), where the car- 
riers in the conducting layer are not trapped by 
the moiré potential, the doped layer in our case 
is a triangular lattice with moiré trapping po- 
tential presumably comparable with the other 
Mott insulator layer. The calculated charge dis- 
tribution under finite D field at integer filling 
(v = -1) is compared with additional doping x
[v = -(1 + x)] in Fig. 4E. The extra carriers
are centered at the B sites of the bottom layer. 
With this partial filling of the B sublattice, 
the ferromagnetic Coulomb exchange from the 
appreciable wave function overlap between 
the nearest-neighbor B and C sites starts to 
dominate over the antiferromagnetic kinetic 
exchange between the C sites in the same layer. 
Thus, ferromagnetic order develops in the 
bilayer system. Developing a full microscopic 
picture of the magnetic phase transitions upon 
application of D field and hole doping is a 
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Fig. 4. Proximity control of the magnetic exchange interaction in the 


Mott state. All data were taken at D = 0.26 V/nm and 
Versus [oH at selected filling factors. (B) Hysteretic c 
signal versus v, highlighting the ferromagnetic state in 


data were obtained by taking the difference of the RMCD signal between the 


magnetic field swept down and up (fig. S8). (Inset) The bl 


phase diagram corresponds to phase space where the data were taken. (C) (Left) 


Schematic of twisted bilayer MoTes at v = —1 with fully 


The top layer is a half-filled triangular lattice with antiferromagnetic interactions 


promising area for future study. One approach, 
explored in a recent theoretical work, is to 
construct effective spin Hamiltonians by using 
couplings extracted from experiment. These 
can then be used as a model to explore the 
evolution of the magnetic ground state upon 
doping of the system (44). We discuss this 
approach further in (37), section 2.2. 

An immediate opportunity to extend the re- 
sults presented here is to explore the rich mag- 
netic phase diagram in the honeycomb lattice, 
such as the predicted antiferromagnetic state 
at v = -2 (half-filled honeycomb lattice) and 
ferromagnetic state at v = -3 (filling of the 
second moiré flat band) (73). R-stacked MoTe. 
also hosts moiré electric polarization with op- 
posite dipole orientation in adjacent moiré 
orbitals. With this feature, combined with the 
direct band-gap properties and magnetic states, 
it is feasible to explore tunable moiré multifer- 
roicity with excitonic spectroscopy (19). Theory 
also predicts a variety of topological states, 
such as a QAH state with an electrically tun- 
able topological phase transition—an interesting 
direction for electrical transport measurements 
(13, 14, 18, 21). In addition, the moiré minibands 
are expected to become flatter as the twist angle 
is decreased. Bilayer MoTe, and WSe, twisted 
at an angle of about 1.4° has been predicted to 
host fractional QAH states (18, 27). Therefore, 
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T=1.6 K. (A) RMCD signal 
omponent of the RMCD 
the phase space of v. The 


ue dotted line in the 2D 


polarized layer pseudospin. 


it will be fascinating to engineer and explore 
magnetic interactions of correlated states at 
fractionally filled minibands as well as asso- 
ciated topological states in small-twist-angle 
bilayer MoTe₂. By engineering multilayer MoTe₂,
such as in a twisted monolayer-bilayer system, it 
may be possible to realize a platform for inves- 
tigating a gate-tunable moiré Kondo lattice (42). 
Past interglacial climates with smaller ice sheets offer analogs for ice sheet response to future warming
and contributions to sea level rise; however, well-dated geologic records from formerly ice-free areas 
are rare. Here we report that subglacial sediment from the Camp Century ice core preserves direct 
evidence that northwestern Greenland was ice free during the Marine Isotope Stage (MIS) 11 interglacial. 
Luminescence dating shows that sediment just beneath the ice sheet was deposited by flowing water 
in an ice-free environment 416 ± 38 thousand years ago. Provenance analyses and cosmogenic nuclide data
and calculations suggest the sediment was reworked from local materials and exposed at the surface
<16 thousand years before deposition. Ice sheet modeling indicates that ice-free conditions at Camp Century
require at least 1.4 meters of sea level equivalent contribution from the Greenland Ice Sheet. 


Global oxygen isotopic composition of
seawater (δ¹⁸Oseawater) data (1) highlight
several interglacial periods since ~1 mil- 
lion years ago (Ma) with global ice vol- 
ume lower than present, implying that 
the Greenland Ice Sheet (GrIS) and/or the 
Antarctic Ice Sheet were previously smaller 
than today. The lowest global ice volumes oc- 
curred during Marine Isotope Stage (MIS) 31 
(1.08 to 1.06 Ma) and MIS 11 [424 to 374 thou- 
sand years ago (ka)]. Moderately lower ice vol- 
umes occurred during MIS 9 (337 to 300 ka), 
and MIS 5 (130 to 71 ka). Global mean sea level 
(GMSL) reconstructions indicate that the
interglacials with the highest sea level were
MIS 11 (+6 to 13 m) (2, 3) and MIS 5 (+1.2 to 
5.3 m) (4), consistent with substantial reduc- 
tion of the GrIS. However, the configuration of 
the GrIS, and thus its contribution to GMSL, 
during past interglacial periods remains poorly 
constrained (3). 

Marine sediment archives document fluc- 
tuations in GrIS extent since ~1 Ma. Ice-rafted 
debris (IRD) in marine sediment from the 
North Atlantic Ocean indicates variable but 
persistent marine-terminating glaciation in 
eastern Greenland (Fig. 1) (5). Increased ¹⁰Be
concentrations in IRD during some inter- 
vals can be interpreted as a record of glacial 
erosion of landscapes exposed by smaller 
GrIS configurations during the Pleistocene (6). 
In the Labrador Sea, provenance changes in 
silt-sized sediment suggest that the south- 
ern GrIS retreated slightly during MIS 9 
and 5 and was almost entirely absent during 
MIS 11 (7-9). Pollen in Labrador Sea sediment 
indicates that southern Greenland was in- 
habited by boreal forests during MIS 11 and 
possibly MIS 13 (533 to 478 ka), by tundra 
during MIS 7 (243 to 191 ka), and by fern- 
dominated (Pteridophyta) ecosystems during 
MIS 5 and perhaps MIS 9 (10). Compared to 
the Holocene, sea surface temperatures in 
the Labrador Sea were much warmer during 
MIS 9, 7, and 5 for brief periods [12.2, 4.4, and 
9.7 thousand years (kyr), respectively] and 
slightly warmer during MIS 11 for a longer 
duration (20.7 kyr) (11, 12). Terrestrial temperatures
reconstructed from leaf-wax biomarkers
show that several interglacials since 600 ka 
were warmer than the Holocene, including 
brief extreme warmth during MIS 5 and prolonged
moderate warmth during MIS 11 (13)
(Fig. 1B). 


Age constraints of Greenland ice core
materials require substantial GrIS retreat at
least once since ~1 Ma (Fig. 1B). Silty ice from
the bottom of the Greenland Ice Core Project 
(GRIP) ice core suggests glacial cover in central
Greenland since 950 ± 44 ka (¹⁰Be/³⁶Cl)
(14) and 970 ± 140 ka (δ⁴⁰Ar/³⁸Ar) (15). Nearby,
at the Greenland Ice Sheet Project 2 (GISP2)
ice core, basal ice has δ⁴⁰Ar/³⁸Ar ages of
>250 ka (16). Cosmogenic ²⁶Al/¹⁰Be data from
the underlying subglacial bedrock at GISP2
require at least one episode of exposure, and
thus ice-free conditions, in central Greenland
after 1.1 ± 0.1 Ma (17). In southern Greenland,
basal silty ice from the Dye-3 ice core contains 
DNA from boreal forest that occupied the area 
sometime between 400 and 800 ka (14). In the
North Greenland Eemian Ice Drilling (NEEM) 
ice core, folded ice near the bottom of the ice 
core dates to MIS 5e (18), but the basal silty ice
and underlying sediment have not been dated
(19). In northwestern Greenland, subglacial
sediment from the Camp Century ice core 
contains evidence for at least two ice-free 
events, one in the Early Pleistocene and an- 
other after 1 Ma (20). 

The Camp Century subglacial sediment 
(3.44-m-long, 0.1-m-diameter core) was re- 
covered in 1966 CE at the base of the ice core 
(1387 m depth) (21) and has yet to be extensively
studied. The frozen sediment (Fig. 2A)
contains an upper unit of bedded sand (0 to 
0.82 m below the ice-sediment interface) un- 
conformably overlying ice-rich graded mud, 
sand, and pebbles (0.82 to 1.12 m), an inter- 
mediate unit of vertically fractured sediment- 
laden ice (1.12 to 2.05 m), and a lower unit of 
diamicton containing subhorizontal ice lenses 
(2.05 to 3.44 m). An early investigation (22) 
that reported abundant freshwater diatoms 
and rare (likely windblown) marine diatoms 
in the Camp Century basal ice and subglacial 
sediment suggested that northwest Greenland 
was ice free and the GrIS retreated at some 
point during the Pleistocene. Recent analyses 
focused on the uppermost and lowermost sam- 
ples [sample 1059-4: 0 to 0.10 m (Fig. 2B); and 
sample 1063-7: 3.27 to 3.40 m (Fig. 2C)] (20). 
Enriched δ¹⁸O values of pore ice from these
samples suggest that precipitation fell at lower 
elevations and/or under warmer conditions, 
indicating local absence of the GrIS (20). Plant 
macrofossils and sedimentary leaf-wax bio- 
markers provide direct evidence for a tundra 
ecosystem during ice-free events (20). Clay 
mineralogy and major ions from pore ice sug- 
gest that the upper and lower sediment sam- 
ples have different weathering histories (20). 
²⁶Al/¹⁰Be data from the upper sediment require
ice-free exposure at some time after 1.0 ± 0.1 Ma;
however, the precise timing of this ice-free 
event was not constrained (20). The lower sed- 
iment was buried at some point between the 
Early Pleistocene (>1.4 Ma; infrared stimulated
luminescence) and Late Pliocene (<3.2 ± 0.4 Ma;
²⁶Al/¹⁰Be) (20).

Fig. 1. Overview map and paleoclimate. (A) Overview map of Greenland
showing marine sediment cores and ice core locations, including Camp Century.
Elevation data from (32). masl, meters above sea level. (B) Paleoclimate
from 500 to 300 ka: Marine Isotope Stages (MIS), with light-pink shading
corresponding to interglacial periods and darker-pink shading highlighting MIS 11;
δ¹⁸Oseawater, a proxy for global ice volume (1); Nordic Sea IRD flux (ODP 907)
(5); southern Greenland sea surface temperature (SST; ODP U1305) (11);

Here, we present data that provide direct 
terrestrial evidence of ice sheet absence in 
northwestern Greenland during the MIS 11 
interglacial period. We present new lumines- 
cence dates of the upper subglacial sediment 
(1059-4) at the base of the Camp Century ice 
core. When combined with in situ cosmogenic 
²⁶Al and ¹⁰Be measurements from the same
sediment (20), the luminescence data allow 
us to constrain the timing of ice-free condi- 
tions and to model the maximum possible 
duration of surface exposure of the sediment. 
We use mineralogical and detrital geochronometric
analyses (apatite U-Th/He; hornblende
⁴⁰Ar/³⁹Ar; and zircon, apatite, and rutile U-Pb)
to characterize the provenance of the upper 
(1059-4) and lower (1063-7) subglacial sedi- 
ment. We use an ensemble of ice sheet models 
to simulate GrIS configurations that produce 
ice-free conditions at Camp Century and quan- 
tify the resulting sea level equivalent (SLE) 
contributions from the GrIS [see supplemen- 
tary materials (SM) for complete explanations 
of materials and methods]. 


Sediment deposition and 
paleo-exposure history 


Luminescence dating shows that the upper sub- 
glacial sediment from Camp Century was last 
exposed to sunlight 416 ± 38 ka (mean age ±
1σ) during the MIS 11 interglacial period. We
dated two sand-sized fractions of potassium 
feldspar recovered from an interior portion 
of sample 1059-4 cut in darkroom conditions 
while frozen (Fig. 2B; SM methods). Feld- 
spar separates were dated using post-infrared 
infrared-stimulated luminescence at 250°C fol- 
lowing multiple elevated temperatures (MET 
pIR IRSL₂₅₀) (23) (SM methods; Fig. 3, figs. S1
to S4, and data S1 to S11). Anomalous fading 
(athermal loss of signal over time), which is 
common in feldspar, was corrected indepen- 
dently for each aliquot and pIR IRSL tempera- 
ture (24) (SM methods; data S3 to S5). Different 
grain sizes from the same subsample produced 
fading-corrected pIR-IRSL₂₅₀ apparent ages
of 459 ± 34 ka (63 to 150 µm; n = 22 aliquots;
weighted mean ± 1 SE) and 416 ± 32 ka (150
to 355 µm; n = 20 aliquots) (data S1 and S4).
The elevated temperature pretreatments ap- 
plied in the MET pIR IRSL can subsample 
residual signals not reset by light, necessitat- 
ing correction for their removal (SM methods; 
data S8) (25). The fading and residual-dose-corrected
ages are 434 ± 39 ka (63 to 150 µm;
n = 22 aliquots) and 395 ± 36 ka (150 to 355 µm;
n = 20 aliquots) (Fig. 3 and data S1). Combining 
data from both grain-size fractions, the mean 
fading and residual-dose-corrected IRSL age 
indicates that the upper subglacial sediment
was last exposed to sunlight at 416 ± 38 ka,
during the MIS 11 interglacial (424 to 374 ka)
(Fig. 1B and data S1). The IRSL ages record
the deposition of the upper sediment after
exposure to light in an ice-free environment.
The coherent luminescence measurements in
multiple aliquots (n = 42 total) from two grain
sizes demonstrate that the IRSL age is robust
and is not the result of mixing exposed and
unexposed grains in sample aliquots. Notably,
the absence of young ages indicates that
the core interior was not exposed to light
during drilling (1966 CE), sample storage
(1966 to 2020 CE), or subsampling and processing
(2020 CE).

Fig. 1 (continued). Leaf-wax biomarker proxy (solid blue circles and light blue lines) for terrestrial
summer temperature (ODP 646) (13); total counts of pollen (green) and
spermatophyte spores of pteridophytes (purple) in Labrador Sea sediment (ODP
646) (10); and Camp Century upper sediment (1059-4) fading and residual-dose-corrected
pIRSL ages (weighted mean ± 1 SE; larger red square with thick
blue line shows the mean pooled age, and the smaller red squares with thin blue
lines show the mean ages for each grain-size fraction).

The newly determined luminescence age allows
us to decay-correct the measured ²⁶Al/¹⁰Be
ratio (4.4 ± 0.5) in quartz of the upper sediment
(20) to the ratio at the time of deposition.
We can then model how long the sediment was
exposed at the surface before deposition. If
we assume that the upper sediment was exposed
at the surface, deposited, and buried
since 416 ± 38 ka, then the ²⁶Al/¹⁰Be ratio
would have been 5.4 ± 0.7 at burial (Fig. 4 and
fig. S5). Our model indicates that all plausible
solutions require exposure durations of
<16 kyr [assuming a Greenland-specific ²⁶Al/¹⁰Be
surface production ratio of 7.3 (26); SM meth- 
ods]. This maximum 16 kyr exposure dura- 
tion assumes irradiation at the surface; if 
nuclide production occurred at depth, then a
greater maximum duration of exposure is
possible.

Fig. 2. Camp Century subglacial sediment core log and sample photos. (A) Camp Century subglacial
sediment original core tube photos adapted from (44), photogrammetric three-dimensional (3D) model
renderings from recent photographs, bulk density measurements, and core log updated from (20) with the
locations of samples 1059-4 (bluish gray) and 1063-7 (reddish pink) highlighted. Sediment core segment
photographs of samples (B) 1059-4 and (C) 1063-7, showing interior and exterior core faces and the location
of the luminescence sample.

The luminescence age-informed ²⁶Al/¹⁰Be
burial history of the upper sediment is consistent
with the ²⁶Al/¹⁰Be burial history of the
underlying lower Camp Century subglacial
sediment (sample 1063-7). Previous IRSL analysis
of the lower sediment indicates that it was
deposited >1.4 Ma (20), and the measured
²⁶Al/¹⁰Be ratio (1.7 ± 0.4) suggests that the
lower sediment remained buried and was
shielded from substantial nuclide production
while the upper sediment was exposed to
cosmic rays and light during MIS 11. The
²⁶Al/¹⁰Be ratio of the lower sediment, decay-corrected
for 432 kyr of burial (416 kyr luminescence
age plus a maximum of 16 kyr
surface exposure), is 2.1 ± 1.2 (Fig. 4A and fig.
S5). If the inherited ²⁶Al/¹⁰Be ratio (i.e., before
MIS 11 exposure) of the upper sediment was
equivalent to that of the lower sample (2.1 ±
1.2), then the surface-exposure duration of the
upper sediment could be no more than 14 kyr 
(Fig. 4A and fig. S5; SM methods). This simi- 
larity suggests that the upper sediment was 
sourced from material with an integrated ex- 
posure and burial history comparable to that 
of the lower sediment (Fig. 4B), which is consistent
with the provenance data.
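
The decay corrections quoted above can be checked with a short calculation. The sketch below is illustrative and is not the authors' model: it assumes half-lives of 0.705 Myr for ²⁶Al and 1.387 Myr for ¹⁰Be (commonly used values that are not stated in the text), ignores any nuclide production during burial, and uses a hypothetical function name. Under those assumptions it scales a measured ratio back to its value at the time of burial.

```python
import math

# Assumed half-lives in years (not given in the text above).
HALF_LIFE_AL26_YR = 0.705e6
HALF_LIFE_BE10_YR = 1.387e6


def ratio_at_burial(measured_ratio: float, burial_years: float) -> float:
    """Undo radioactive decay of a 26Al/10Be ratio over a burial period.

    26Al decays faster than 10Be, so the ratio falls during burial at a rate
    set by the difference of the two decay constants. Production while
    buried is assumed to be zero.
    """
    lam_al = math.log(2) / HALF_LIFE_AL26_YR
    lam_be = math.log(2) / HALF_LIFE_BE10_YR
    return measured_ratio * math.exp((lam_al - lam_be) * burial_years)


# Upper sediment: measured 4.4, buried for 416 kyr.
upper = ratio_at_burial(4.4, 416e3)
# Lower sediment: measured 1.7, corrected for 432 kyr (416 kyr + 16 kyr).
lower = ratio_at_burial(1.7, 432e3)
print(f"upper: {upper:.1f}, lower: {lower:.1f}")
```

With these inputs the sketch gives about 5.4 and 2.1, in line with the central values reported above; propagation of the measurement and age uncertainties is omitted.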


Sediment provenance 


The upper and lower subglacial sediment 
samples at Camp Century have statistically 
indistinguishable provenance. Each of the 
detrital geochronometry measurements, in- 
cluding apatite (U-Th)/He dates, hornblende 
⁴⁰Ar/³⁹Ar dates, and zircon and apatite concordant
U-Pb dates (Fig. 5, figs. S6 to S8, and
data S12 to S15) are similar and likely drawn 
from the same parent population, as shown by 
multiple statistical tests, including Kolmogorov- 
Smirnov, Kuiper, likeness, similarity, and cross- 
correlation coefficient (27) (data S16). Rutile 
U-Pb dates could only be determined for sam- 
ple 1059-4 (fig. S8 and data S17), but those data 
support an 1800 to 2000 Ma probability mode 
in the zircon U-Pb data reflecting a Paleoproterozoic
metamorphic imprint on Greenland
Archaean crust (28, 29). Likewise, the bulk 
mineralogy of the upper and lower sediment 
samples is similar and dominated by quartz 
and feldspar (Fig. 5E and data S18). The heavy 
mineral composition is mostly mafic silicates 
(amphiboles and pyroxene) and garnet, with 
lesser amounts of chlorite, iron oxides, apatite, 
and ilmenite (Fig. 5F). A Kolmogorov-Smirnov 
test demonstrates that the mineralogy of the 
upper and lower sediment is statistically simi- 
lar across multiple grain sizes and mineral 
fractions (data S19). However, the dominance 
of garnet over amphibole and pyroxene [g/(A+P)] 
in the upper sample (0.13) compared with the 
lower sample (0.04) may suggest more-intense 
weathering of heavy minerals in the upper 
sediment (Fig. 5F). These observations indi- 
cate that the upper sediment formed by re- 
working of local materials similar to the lower 
sediment, consistent with our modeling of the 
²⁶Al/¹⁰Be ratio and luminescence data.


Ice sheet modeling 


We used an ensemble of numerical ice sheet 
simulations to determine the minimum SLE 
contribution of the Greenland Ice Sheet required 
for Camp Century to be ice free (SM methods). 
The ensemble includes 96 different simulations 
varying the following parameters: rate of interglacial
climate warming (1°, 1.33°, 1.66°, and
2°C/kyr), starting ice sheet configuration
[Last Glacial Maximum (LGM), modern with 
spin-up, modern with “cold” start], starting 
climatology (modern or Holocene Thermal 
Maximum), precipitation lapse rate, and as- 
thenosphere relaxation time. These parame- 
ters were chosen to incorporate unknowns in 
the interglacial MIS 11 climate and the glacial 
MIS 12 extent of the ice sheet to encompass a 
range of possible interglacial climate warming 
scenarios. Rates of interglacial warming were 
chosen to reflect the range of atmospheric 
warming in proxy records from around Greenland 
(30, 31). For the analysis, we focus on the first
time step in each simulation when the Camp
Century site is ice free. We compiled the ice
sheet extents of all simulations that produced
ice-free conditions at Camp Century and the
associated SLE contribution relative to the
present ice sheet (32) (SM methods; Fig. 6A
and fig. S9).

Fig. 3. Luminescence ages of the upper sediment (1059-4). Dose-response curve (DRC) faded post-infrared
analysis at multiple elevated temperatures (pIR-MET IRSL) and measured fading rates in potassium
feldspar aliquots from the upper sediment (1059-4) in two different grain sizes: (A) 63 to 150 µm and
(B) 150 to 250 µm. Reported age and gray shading show the results for the 250°C temperature step
(pIR-MET 250) used for dating.

The simulation with the least amount of ice 
loss needed to drive deglaciation at Camp 
Century equates to +1.4 m of SLE contribu- 
tion from the GrIS. Out of 96 hypothetical 
interglacials, spanning a range of uncertain- 
ties in climatological, glaciological, and solid- 
Earth parameters, 79 result in deglaciation at 
Camp Century (Fig. 6A and fig. S8; SM meth- 
ods). The ice sheet geometries correspond- 
ing to the first ice-free conditions at Camp 
Century equate to a range of sea level contri- 
butions (1.4 to 5.5 m SLE). The simulations 
that provide the low-end constraint (+1.4 m 
SLE) start with a modern ice sheet geometry 
and climate and use a 2% per °C precipitation 
correction (Fig. 6, B and C). Starting with a
glacial ice sheet configuration, using a Holo- 
cene Thermal Maximum climatology, and/or 
ignoring the precipitation correction increases 
this lower bound (SM methods). Modeling ice- 
free conditions at Camp Century thus provides 
a strong minimum constraint on the contri- 
bution of the GrIS to GMSL during MIS 11. 
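
The ensemble design described above can be sketched as a parameter grid. This is a hypothetical reconstruction, not the authors' configuration: the text gives the four warming rates, three starting configurations, and two climatologies, but the two-level choices for precipitation correction and asthenosphere relaxation time, and the relaxation values themselves, are assumptions made so that a full factorial grid yields 96 members. The helper that picks the low-end constraint is likewise illustrative.

```python
from itertools import product

# Parameter levels. The first three lists follow the text; the last two are
# assumptions chosen so that a full factorial design gives 96 members.
WARMING_RATES_C_PER_KYR = [1.0, 1.33, 1.66, 2.0]
START_CONFIGS = ["LGM", "modern_spinup", "modern_cold_start"]
CLIMATOLOGIES = ["modern", "holocene_thermal_maximum"]
PRECIP_CORRECTION_PCT_PER_C = [0.0, 2.0]   # assumed: none vs. 2% per degC
RELAXATION_TIMES_YR = [1000.0, 3000.0]     # hypothetical placeholder values


def build_ensemble():
    """Return one dict of settings per ensemble member (full factorial grid)."""
    keys = ("warming_rate", "start_config", "climatology",
            "precip_correction", "relaxation_time")
    grid = product(WARMING_RATES_C_PER_KYR, START_CONFIGS, CLIMATOLOGIES,
                   PRECIP_CORRECTION_PCT_PER_C, RELAXATION_TIMES_YR)
    return [dict(zip(keys, combo)) for combo in grid]


def minimum_ice_free_sle(results):
    """Smallest sea level equivalent (m) among runs where the site deglaciates.

    `results` is an iterable of (site_ice_free, sle_m) pairs, one per run.
    """
    ice_free_sle = [sle for ice_free, sle in results if ice_free]
    if not ice_free_sle:
        raise ValueError("no simulation produced ice-free conditions")
    return min(ice_free_sle)


ensemble = build_ensemble()
print(len(ensemble))
```

Filtering on the ice-free flag before taking the minimum mirrors the logic in the text: runs that never deglaciate the site say nothing about the minimum ice loss required.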


Discussion 


The Camp Century subglacial sediment con- 
tains multiple lines of evidence for the de- 
glaciation of northwestern Greenland during 
the MIS 11 interglacial period (424 to 374 ka). 
The sorted, stratified upper sediment that 
contains abundant, well-preserved tundra 
plant macrofossils and biomarkers (20) was 
transported and deposited by flowing water in 
an ice-free environment. The coherent distribution
of IRSL ages (416 ± 38 ka in 42 aliquots)
strongly suggests deposition of the upper 
sediment in an ice-free environment where 
the luminescence signal was effectively zeroed 
by exposure to light at the land surface. Statistically
indistinguishable provenance and
mineralogy data indicate that sediment in the
uppermost sample was derived from the erosion
of material similar in composition to that
of the lower sample and that both samples
originated from the weathering of the Precambrian
shield of northern Greenland (28).
The dominance of garnet over less-durable
heavy minerals in the upper sediment suggests
additional weathering in an ice-free
environment (27). The durations we calculate
for interglacial exposure at MIS 11, using a
combination of ²⁶Al/¹⁰Be and luminescence
data, are reasonable given other records of
MIS 11 duration and are internally consistent
at <16 kyr (33).

Fig. 4. Luminescence age-informed modeling of
²⁶Al/¹⁰Be burial history for the upper (1059-4)
and lower (1063-7) sediments. (A) Inherited
²⁶Al/¹⁰Be ratios and (B) corresponding inherited
total burial histories of the upper and lower sediments,
given a range of surface exposure duration
of the upper sediment before deposition at 416 ±
38 ka, while the lower sediment remained buried
during MIS 11.

Fig. 5. Detrital geochronometry and mineralogy from the upper (1059-4)
and lower (1063-7) sediments. Kernel density functions of (A) apatite U-Th/He
dates, (B) ⁴⁰Ar/³⁹Ar hornblende dates, and (C) ²⁰⁷Pb/²⁰⁶Pb zircon dates. Thin
lines show individual ages and 1σ uncertainty, thick lines show kernel density
function. (D) Tera-Wasserburg concordia intercept U-Pb dates for apatite. In (A) to
(D), n refers to the number of dated mineral grains from each sample. (E) Bulk and
(F) heavy mineralogy by percent area determined from automated quantitative
mineralogy (data S18).

Fig. 6. Ice sheet modeling results that maintain an ice-free Camp Century. (A) Ice sheet extent from
an ensemble of ice sheet simulations and associated SLE contributions such that Camp Century is ice
free, with present ice extent (white) and ice-free areas (tan) shown for comparison. (B) Extent and ice
thickness and (C) ice thickness change relative to modern ice sheet geometry (32) of the simulation with the
least ice loss such that Camp Century deglaciates. Green circles mark the location of Camp Century.

The deglaciation of northwestern Greenland
during MIS 11 confirms far-field paleoclimate
records indicating the GrIS was smaller at this
time (Fig. 1B). In Labrador Sea sediment cores,
reduced input from specific Greenland bedrock
terranes (7, 8) and elevated pollen concentrations
[Ocean Drilling Program (ODP) Site 646]
(10) suggest ice-free and forested conditions
in southern Greenland during MIS 11. Although
summer sea surface temperatures in the
Labrador Sea (11, 12) and Greenland terrestrial
summer temperatures (13) during MIS 11
were less extreme than other interglacials of
the past 600 kyr (Fig. 1B), sustained terrestrial
summer warmth caused greater retreat of the 
GrIS during MIS 11 (7, 13). The Labrador Sea 
proxy studies and ice sheet modeling inde- 
pendently indicate southern GrIS retreat 
for ~16 kyr before cooling and ice sheet 
expansion after 390 ka (7-11, 13), which is 
consistent (within uncertainties) with the 
timing and duration of ice-free conditions 
in northwest Greenland that we document. 
Some exposure-burial scenarios modeled from 
²⁶Al/¹⁰Be measurements in GISP2 subglacial
bedrock in central Greenland (17) are com- 
patible with the deglaciation of southern and 
northwestern Greenland during MIS 11. How- 
ever, the persistence of eastern highlands- 
sourced IRD in the Nordic Sea (5) during MIS 11 
indicates that at least eastern Greenland re- 
mained glaciated. Our ice sheet modeling is 
consistent with these observations: Simula- 
tions that produce ice-free conditions at Camp 
Century also result in ice retreat in southern 
Greenland, but nearly all simulations main- 
tain ice cover in the eastern highlands to some 
degree (Fig. 6A). Isotopic analyses indicate 
that silt in both GRIP basal ice and GISP2 
subglacial till was sourced from the eastern 
highlands (34); thus, age constraints of GRIP
basal ice [950 ± 44 ka (¹⁰Be/³⁶Cl) (14) or 970 ±
140 ka (δ⁴⁰Ar/³⁸Ar) (15)] further suggest ice
cover in the eastern highlands, and possibly 
central Greenland, during MIS 11. 

The deglaciation of northwest Greenland, as 
demonstrated here, provides important geo- 
logic constraints on the GrIS contribution to 
the MIS 11 GMSL budget, which was +6 to 13 m 
higher than present (2). Previously, the esti- 
mated ~4.5 to 6 m SLE contribution from the 
GrIS during MIS 11 (2, 3) was largely deduced 
from the total GMSL budget and not based on 
specific ice sheet sources owing to the scarcity 
of direct geological constraints from Greenland 
or Antarctica (3). Our ice sheet modeling 
shows that the GrIS configuration with the 
least amount of ice loss needed for ice-free 
conditions at Camp Century produces +1.4 m 
of SLE from Greenland relative to the present 
GrIS configuration. However, model-based estimates
for GrIS contribution given ice-free conditions
at Camp Century include a wide range
of results, up to loss of the entire ice sheet and 
concomitant sea level contribution of ~7 m 
of SLE (Fig. 6 and fig. S8). Camp Century is 
located on a local ice dome in a cold area that 
receives abundant precipitation from Baffin 
Bay, making this sector of the GrIS resilient 
to warming. Regardless of how far the GrIS 
retreated into the interior, ice-free conditions 
at Camp Century explain ≥1.4 m of SLE contribution
from Greenland to the +6 to 13 m
GMSL budget of MIS 11 (2). 


Implications 


Our data show substantial retreat of the GrIS 
during the long, moderately warm MIS 11 in- 
terglacial, during which atmospheric carbon 
dioxide concentrations reached a maximum 
of 286 parts per million (ppm) (35). Northern 
Hemisphere summer insolation during MIS 11 
was not appreciably different from the present,
but the duration of peak warmth of MIS 11
was exceptionally long (29 kyr) because of
the orbital configuration at the time (35-37). 
The extended duration of MIS 11 interglacial 
warmth resulted in large-scale retreat of the 
northwestern (this study) and southern GrIS 
(7, 10, 11, 13, 38). Going forward, the long at- 
mospheric residence time of anthropogenic 
greenhouse gases will prolong current, human-induced
climate warming for many thousands
of years. Even under the intermediate Representative
Concentration Pathway 4.5, in which
atmospheric CO₂ concentrations begin to decline
after 2040 CE, atmospheric CO₂ will take
~30 kyr to return to 380 ppm (39), which is 
still ~100 ppm above the concentration reached 
during MIS 11 (35). If moderate warmth for 
29 kyr during MIS 11 resulted in substantial 
ice loss from Greenland, then rapid, prolonged, 
and considerable anthropogenic Arctic warm- 
ing (40) will likely cause melting of the GrIS, 
raise sea level, and trigger additional climate 
feedbacks in the coming centuries (41-43). 
Genomic assessment of invasion dynamics of
SARS-CoV-2 Omicron BA.1 
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of concern (VOCs)
now arise in the context of heterogeneous human connectivity and population immunity. Through
a large-scale phylodynamic analysis of 115,622 Omicron BA.1 genomes, we identified >6,000 
introductions of the antigenically distinct VOC into England and analyzed their local transmission 
and dispersal history. We find that six of the eight largest English Omicron lineages were already 
transmitting when Omicron was first reported in southern Africa (22 November 2021). Multiple 
datasets show that importation of Omicron continued despite subsequent restrictions on travel from 
southern Africa as a result of export from well-connected secondary locations. Initiation and 
dispersal of Omicron transmission lineages in England was a two-stage process that can be 
explained by models of the country’s human geography and hierarchical travel network. Our results 
enable a comparison of the processes that drive the invasion of Omicron and other VOCs across
multiple spatial scales.


Since the emergence of SARS-CoV-2 in
late 2019, multiple variants of concern 
(VOCs) have sequentially dominated the 
pandemic worldwide. The Omicron VOC 
(Pango lineage B.1.1.529, later divided 
into lineages including BA.1 and BA.2) was 
discovered in late November 2021 through 
genomic surveillance in Botswana and South 
Africa and a traveler from South Africa in 
Hong Kong (1); it was designated a VOC by the
World Health Organization on 26 November 
(2). An initial surge in Omicron cases in South 
Africa indicated a higher transmission rate than 
previous VOCs (3), which studies later attributed 
to a shorter serial interval, increased immune 
evasion, and greater intrinsic transmissibility 
(4-7). The mechanism for greater transmissibil- 
ity is hypothesized to be altered tropism and 
higher replication in the upper respiratory tract 
(8, 9). Together with waning levels of popula- 
tion immunity from previous infections and 
vaccination (10), local transmission of Omicron
BA.1 was reported soon thereafter in travel hubs 
worldwide, including New York City and London 
by early December 2021, despite travel restric- 
tions on international flights from multiple 
southern African countries (11, 12). 
Following the first confirmed case of Omicron 
in England on 27 November 2021 (13), Omicron 
prevalence increased rapidly across all regions 
of England, with Greater London prevalence 
peaking first in mid-December at ~6% followed 
by the South East region (14). Other metropolitan
areas in North West and North East England
saw similar but delayed increases in prevalence 
with observed peaks between early- and mid- 
January 2022. By January 2022, Omicron inci- 
dence had declined substantially in Greater 
London and other southern regions resulting 
in decreasing prevalence from north to south 
England (15). Rapid growth in infections during 
the initial emergence of Omicron in England 
prompted the UK government to impose in- 
terventions including a move to “Plan B” non- 
pharmaceutical restrictions (mandatory COVID 
pass for entry into certain venues, face coverings, 
and work-from-home guidance) on 8 December 
2021 (16), in addition to an accelerated program 
of booster vaccination for all adults by mid- 
December 2021 (17). SARS-CoV-2 prevalence in 
England decreased later in January 2022, co- 
incident with a falling proportion of BA.1 in- 
fections as lineage BA.2 became the dominant 
lineage; BA.2 was itself later replaced by line- 
ages BA.4 and BA.5 (18-20). 

Understanding and quantifying the relative 
contributions of the factors that determined 
the arrival and spatial dissemination of Omicron 
BA.1 in England can help inform the design of 
spatially targeted interventions against VOCs 
(21). We analyzed the Omicron BA.1 wave in 
England, using a dataset of 48,748 Omicron 
BA.1 genomes from England. This dataset rep- 
resents ~1% of all confirmed Omicron BA.1 
cases in England during the study period and 
is combined with aggregated and anonymized
human mobility and epidemiological data
from lower tier local authorities (LTLAs) in
England.


International importation and Omicron BA.1 
lineage dynamics 


To investigate the timing of virus importations 
into England and the dynamics of the result- 
ing local transmission lineages, we undertook 
a large-scale phylodynamic analysis of 115,622 
SARS-CoV-2 Omicron genomes, sampled globa- 
lly between 8 November 2021 and 31 January 
2022. About 42% (n = 48,748) were sampled 
from England and sequenced by the COVID-19 
Genomics UK (COG-UK) consortium (22). All 
available genomes [from COG-UK and the Global 
Initiative on Sharing All Influenza Data (GISAID) 
(23) on 12 and 9 April 2022, respectively] sampled 
before 28 November 2021 were included; later 
genomes were subsampled randomly in propor- 
tion to weekly Omicron case incidence while 
maintaining a ~1:1 ratio between English and 
non-English samples. To reduce potential bias 
caused by heterogeneous sequencing cover- 
age, we performed a weighted subsampling of 
the English genomes using a previously developed 
procedure that accounts for variation in 
the number of sequences sampled per reported 
case at the upper tier local authority (UTLA) 
level (24) (supplementary materials). 
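The incidence-proportional subsampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, week labels, case counts, and genome identifiers are all hypothetical, and the UTLA-level weighting step is omitted.

```python
import random

def weekly_quotas(weekly_cases, n_target):
    """Split n_target samples across weeks in proportion to case incidence."""
    total_cases = sum(weekly_cases.values())
    return {week: round(n_target * cases / total_cases)
            for week, cases in weekly_cases.items()}

def subsample(genomes_by_week, quotas, seed=0):
    """Randomly draw up to the weekly quota of genomes, without replacement."""
    rng = random.Random(seed)
    picked = []
    for week, genomes in genomes_by_week.items():
        k = min(quotas.get(week, 0), len(genomes))
        picked.extend(rng.sample(genomes, k))
    return picked

# Toy inputs (hypothetical): weekly case counts and 50 genome IDs per week.
weekly_cases = {"2021-W48": 100, "2021-W49": 300, "2021-W50": 600}
english = {w: [f"ENG-{w}-{i}" for i in range(50)] for w in weekly_cases}
other = {w: [f"INT-{w}-{i}" for i in range(50)] for w in weekly_cases}

# Applying the same quotas to both pools keeps the ~1:1 ratio.
quotas = weekly_quotas(weekly_cases, n_target=40)
english_sample = subsample(english, quotas, seed=1)
other_sample = subsample(other, quotas, seed=2)
```

Using one set of quotas for the English and non-English pools is one simple way to hold the ratio near 1:1 while following the incidence curve; other allocation schemes would also satisfy the description.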

We identified at least 6455 [95% highest pos- 
terior density (HPD): 6184 to 6722] independent 
importation events. Most imports from outside 
of England [69.9% (95% HPD: 69.0 to 70.7)] led 
to singletons (i.e., a single genome sampled in 
England associated with an importation event, 
which did not lead to observable local transmis- 
sion in our dataset). The earliest importation is 
estimated between 5 and 18 November [approxi- 
mated as the midpoint between the inferred 
times of the most recent common ancestor 
(MRCA) of the transmission lineage and the par- 
ent of the MRCA (PMRCA)]. Between the first 
introduction and mid-December 2021, we recon- 
struct an approximately exponential increase in 
the daily number of imports, before a plateau in 
early January 2022 (Fig. 1C). Daily importation 
rate may have risen between 22 November (when 
Omicron was first reported) and 25 November 
(when travel restrictions started). Increased 
outflows of air passengers before (and possibly 
in anticipation of) the imposition of travel re- 
strictions have been reported for SARS-CoV-2 
elsewhere (25, 26). The importation rate ap- 
pears to re-accelerate early in December, de- 
spite restrictions on incoming international 
travel from 11 southern African countries; im- 
ports then could have originated from BA.1 
outbreaks in other countries in late November 
and early December 2021. 

To explore this hypothesis, we calculate the 
estimated importation intensity (EII) of Omicron 
BA.1 from countries with the highest air traffic 
volumes to England, capturing 80% of incoming 
passengers. For each source location, the EII 
combines the weekly average COVID-19 test 
positivity rate, weekly relative prevalence of 
Omicron BA.1 genomes, and monthly number 
of observed air passengers traveling to England 
and thus represents a relative rate of importa- 
tion (details and sensitivity analyses are avail- 
able in supplementary materials; figs. S4 to S6). 
Although the earliest imports were inferred 
to have come mostly from South Africa, we ob- 
serve a diversification in the inferred sources of 
BA.1 imports by late November and early 
December 2021 (Fig. 2A), during the period of 
travel restrictions (mandatory hotel quaran- 
tine) (27) on international travel from South 
Africa. We conclude that the exponential growth 
of BA.1 importations through mid-December is 
in part due to introductions from countries 
other than South Africa (Fig. 1B and Fig. 2), as 
a result of their growing Omicron epidemics 
and substantial air travel volumes to England 
(fig. S4). When travel restrictions for 11 southern 
African countries were first announced (Fig. 1A), 
BA.1 genome sequences from only four countries 
globally had been uploaded to GISAID (23). 
We note that our work is not designed to quan- 
titatively assess the impact of travel restrictions 
on infection numbers in England. 
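As a rough illustration of how the EII described above combines its three inputs, the sketch below multiplies them together. The multiplicative form, the function name, and all numbers are assumptions made for illustration; the exact definition is given in the supplementary materials.

```python
def estimated_importation_intensity(positivity, ba1_fraction, passengers):
    """Relative importation rate from one source location.

    positivity   -- weekly average COVID-19 test positivity rate (0 to 1)
    ba1_fraction -- weekly relative prevalence of BA.1 among genomes (0 to 1)
    passengers   -- air passengers traveling to England in the period
    Assumes a simple product of the three terms (an illustrative choice).
    """
    return positivity * ba1_fraction * passengers

# Hypothetical source locations; values are made up for the example.
sources = {
    "country_A": {"positivity": 0.25, "ba1_fraction": 0.5, "passengers": 1000},
    "country_B": {"positivity": 0.125, "ba1_fraction": 0.25, "passengers": 8000},
}

eii = {name: estimated_importation_intensity(**v) for name, v in sources.items()}
aggregated_eii = sum(eii.values())
```

In this toy case the location with lower positivity and BA.1 prevalence still contributes more, because its passenger volume is larger, which mirrors the argument that high-traffic countries can become important secondary sources.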

To cross-validate the importation dynamics 
inferred from viral genomes and EIIs using 
independent data, we collated the travel history 
of inbound travelers who later tested positive 
for BA.1 following their arrival (data generated 
by the UK Health Security Agency; supplementary 
materials). The early temporal profile of 
importation from these data is consistent with 
that inferred from both the EIIs and the 
phylodynamic analysis (until mid-December; 
Fig. 2A), with the growth of the latter being 
slightly lagged (Fig. 2A). This observation is 
consistent with previous studies and is likely 
due to the time lag between international 
importation and the first local transmission event 
observable from genomic data (28). The relative 
frequency of genomically identified BA.1 imports 
among travelers from South Africa and Nigeria 
declined in mid-December as importation from 
other countries began to dominate, consistent 
with the EII results. Observed imports from the 
phylodynamic analysis also declined in January, 
likely due to right censoring (the last genome 
in our dataset was sampled on 31 January). 

Fig. 1. Dynamics of BA.1 transmission lineages in England. (A) Timeline of 
events during the BA.1 wave in England until February 2022. (B) Histogram 
of estimated daily number of BA.1 cases, colored according to the proportion of 
cases attributable to transmission lineages imported at different times (shaded 
region shows period of travel restrictions). Curves show the estimated daily 
frequency of importation (7-day rolling average), colored according to the size of 
resulting local transmission lineages; shading denotes the associated 95% HPD. 
For each of the eight largest detected transmission lineages (A to H), the 
estimated time of importation, TMRCA (inferred time of most recent common 
ancestor), and TPMRCA (inferred time of parent of MRCA) are shown (bottom left of the 
panel). (C) Daily frequency of importation (7-day rolling average; black dots) 
estimated from phylodynamic analysis, without stratification by size of resulting 
local transmission lineage; error bars denote the associated 95% HPD. Solid 
blue line represents an exponential model fitted to the observed 7-day rolling 
average values. (D) Distribution of TPMRCAs and TMRCAs of all 6455 detected 
introductions. Each horizontal line represents a single introduction event that led 
to a transmission lineage or singleton; the left limit indicates the TPMRCA and 
the right limit indicates the TMRCA (or genome sample date, for a singleton). 

Fig. 2. Dynamics of Omicron BA.1 importation into England. (A) Solid curve 
represents the aggregated EII for 27 countries with the highest air passenger 
volumes to England between November 2021 and January 2022 (collectively 
comprising ~80% of air passengers in this period). Colored bars show the weekly 
number of inbound travelers who tested positive for BA.1 following arrival in the 
UK, extracted from travel data compiled by the UKHSA; segments are colored 
according to country of origin. Gray bars show the estimated daily number 
of importation events from phylodynamic analyses. Inset shows a magnified view 
of early trends. Shaded region indicates the period of travel restrictions on travel 
from southern African countries. (B) Estimated weekly number of Omicron BA.1 
cases arriving in England from 27 countries with the highest air passenger 
volumes to England between November 2021 and January 2022 [same as those 
in (A)]. Thick solid lines represent weekly EII from eight countries that contribute 
substantially to overall EII at different times; thin gray lines represent other 
countries. Inset shows a magnified view of early trends. Shaded region indicates 
the period of travel restrictions. 

As with the emergence of previous VOCs in 
England (28, 29), we find that transmission 
lineage sizes are overdispersed (fig. S2), with 
most sampled genomes belonging to a few large 
transmission lineages. The eight largest lineages 
(>700 genomes each) together comprise >60% 
of the English genomes in our dataset (Fig. 1B). 
We infer that six of these eight were imported 
before restrictions on travel from southern 
African countries were introduced (26 November), 
and three could have been introduced before 
the first epidemiological signal of Omicron [a 
change in S-gene target failure (SGTF), samples 
identified by a private lab in South Africa on 
15 November; Fig. 1B]. Although aggregation 
of lineages as a result of unsampled genetic 
diversity outside England could have resulted 
in earlier importation estimates (30), this is un- 
likely given the enrichment of early genomes 
and consistency of the observed lineage size 
distribution with that from the simulation 
(figs. S7 and S8). We observe a strong association 
between the size and time of importation 
of local transmission lineages, with most 
large transmission lineages attributed to early 
introductions (before mid-November) (Fig. 1B). 
This pattern is recapitulated by a simple 
mathematical model; if all lineages share the same 
transmission characteristics, then the date of 
importation is the main determinant of transmission 
lineage size when the epidemic in the 
recipient location is growing exponentially 
(supplementary materials; figs. S7 and S8). 

We estimate that ~400 transmission lineages 
(including the eight largest) resulted from 
importation before the end of travel restrictions 
on 15 December (29 lineages were introduced 
before 26 November). Although these 
early imports account for only a small proportion 
(~6%) of the estimated number of 
introductions, they are responsible collectively 
for ~80% of estimated BA.1 infections in 
England by the end of January 2022. 


Human mobility drives spatial expansion and 
heterogeneity in Omicron BA.1 growth 


The rapid increase in Omicron importation in 
late 2021 led to the establishment of local 
transmission chains, initially concentrated in 
Greater London and neighboring LTLAs in the 
South West and East of England. This coincided 
with early increases in BA.1 prevalence in those 
regions, as observed from SGTF data and 
epidemiological prevalence surveys (15). To 
investigate further the spatiotemporal dynamics of 
BA.1 in England, we reconstructed the dispersal 
history of all identified transmission lineages 
(with >4 genomes) using spatially explicit 
phylogeographic techniques. Genomic sample sizes 
were highly representative of the estimated number 
of BA.1 cases at the UTLA level in England 
(figs. S9 and S10). 

Fig. 3. Spatiotemporal dynamics of BA.1 transmission lineages in England. 
(A and C) Continuous phylogeographic reconstruction of the dispersal history of 
Transmission Lineage-A, the largest detected BA.1 transmission lineage. Nodes 
are colored according to inferred date of occurrence and edge curvature 
(anticlockwise) represents the direction of viral lineage movement. (A) shows the 
progress of dissemination at three specific times whereas (C) shows the 
complete construction. (B) Geographical distribution of the inflow and outflow of 
viral lineages within Transmission Lineage-A, from 1 December to 25 December 
2021. Blue colors indicate areas with high intensity of viral lineage outflow; red 
colors indicate those with high intensity of inflow. Red circles indicate areas with 
high densities of local viral movements (distances <15 km); circle radii are 
proportional to that density. (D) Continuous phylogeographic reconstructions of 
Transmission Lineages-C, E, and G [as per panel (C)] with corresponding 
geographical distributions of viral lineage inflow and outflow [as per panel (B)]. 
Fig. S12 provides equivalent figures for Transmission Lineages-B, D, F, and H. 
(E) Plots in each row show viral lineage movements across different spatial 
scales [(top) <50 km; (middle) 50 to 300 km; (bottom) >300 km]. (Left) 
Histograms showing the daily frequency of viral lineage movements; colors 
indicate whether the origin and/or destination of inferred lineage movements 
occurred in Greater London. (Middle/Right) Solid black lines represent the daily 
frequency of among-region viral lineage movements. Vertical bars indicate the 
proportions of viral lineage movements (aggregated at 2-day intervals); 
colors indicate origin/destination locations. Shaded gray areas indicate periods 
when there were <9 inferred viral lineage movements per day. 

Fig. 4. Predictors of BA.1 viral lineage movements in England. (A) Map at 
LTLA level of model predictors included in the phylogeographic GLM analysis for 
Transmission Lineage-A. (B) For each predictor, the box and whiskers show the 
posterior distribution of the product of the log predictor coefficient and the 
predictor inclusion probability; the left- and right-hand values show the estimates 
for before and after 26 December, respectively. Top and bottom panels show 
estimates for Transmission Lineage-A and -B, respectively. Posterior distributions 
are colored according to predictor type: geographic distances (geo distance, 
dark blue), population sizes at origin and destination (pop size ori/dest, black), 
aggregated mobility matrix (mobility mat, purple), mobility-based community 
membership level 1 and level 2 (comm overlap l1 and l2, purple), Greater London 
origin and destination (gr LDN ori/dest, red), time of peak incidence at origin 
and destination (peak time ori/dest, orange), the residual of a regression of 
sample size against case count at origin or destination (sample res 
ori/dest, yellow). Boxes at the bottom of each panel are numbered and shaded 
to show the ranking of predictors based on their deviance measure (materials 
and methods), with 1 indicating the largest deviance (most important predictor) 
and 12 indicating the smallest (least important predictor). 
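The earlier point that importation date largely determines transmission lineage size during exponential growth can be illustrated with a toy calculation. The sketch below is not the authors' model (which is described in the supplementary materials); it assumes that daily imports grow as exp(a·t), that every lineage grows at the same rate r, and that all parameter values are hypothetical.

```python
import math

def early_import_shares(r, a, cutoff, horizon):
    """Share of introductions and of infections due to imports before `cutoff`.

    r       -- daily growth rate of every local lineage (assumed identical)
    a       -- daily growth rate of the number of imports (assumed exponential)
    cutoff  -- day before which an import counts as "early"
    horizon -- last day of the period considered
    """
    days = range(horizon)
    imports = [math.exp(a * t) for t in days]            # imports on day t
    sizes = [math.exp(r * (horizon - t)) for t in days]  # size of a lineage seeded on day t
    infections = [n * s for n, s in zip(imports, sizes)]
    share_imports = sum(imports[:cutoff]) / sum(imports)
    share_infections = sum(infections[:cutoff]) / sum(infections)
    return share_imports, share_infections

# Hypothetical rates: local growth outpaces growth in imports.
share_imports, share_infections = early_import_shares(
    r=0.2, a=0.1, cutoff=20, horizon=60)
```

Whenever r exceeds a, the early imports are a small minority of introductions but account for most infections, which is qualitatively the pattern reported above (~6% of introductions, ~80% of infections).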

We observe distinct stages in the spread of 
BA.1 across England, with the eight largest 
transmission lineages sharing broadly similar 
patterns of spatial dispersal. Unlike other VOCs, 
the first detected BA.1 transmission lineages are 
more evenly distributed among regions, with 
~20% in Greater London, ~15% in the South East, 
and 13% in the North West (if only introductions 
before December 2021 are considered, the val- 
ue for Greater London is 27%). However, most 
early cases outside Greater London resulted 
in limited local spatial diffusion (Fig. 3 and 
figs. S11 and S12). 

Initial long-distance viral lineage movements 
from Greater London repeatedly arrived in 
multiple urban [as classified in (30)] conurbations 
in early and mid-December 2021, but 
local transmission was not established imme- 
diately. The fraction of viral lineage move- 
ments that were local (within-city) remained 
between 25 and 50% from December 2021 to 
January 2022 in all areas except Greater London 
(~90%) and Greater Manchester (~60%). This 
fraction grew when local mobility levels recovered 
after the holiday period (31-34), coinciding 
with the establishment of local transmission 
across most LTLAs in England (fig. S11). Further, 
cities other than Greater London acted primarily 
as sinks throughout the BA.1 wave, with limited 
backflow of long-distance viral lineages from 
North West England to Greater London (e.g., 
Transmission Lineage-A and -B; similar dynam- 
ics are seen also for South West England; Fig. 3E). 
We define locations as either sinks or sources 
according to whether there was a net flow of 
viral lineages into or out of the location, res- 
pectively, over the study period. 
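The sink/source definition above amounts to taking the sign of each location's net lineage flow. A minimal sketch follows, assuming the inferred movements are available as (origin, destination) pairs; the function name, the "balanced" label for zero net flow, and the toy movements are illustrative assumptions.

```python
from collections import Counter

def classify_locations(movements):
    """Label each location by its net flow of viral lineages.

    movements -- iterable of (origin, destination) pairs, one per inferred
                 lineage movement over the study period.
    """
    net = Counter()
    for origin, dest in movements:
        if origin == dest:
            continue  # within-location movements do not change net flow
        net[origin] -= 1  # lineage leaves the origin
        net[dest] += 1    # lineage arrives at the destination
    return {
        loc: "sink" if flow > 0 else "source" if flow < 0 else "balanced"
        for loc, flow in net.items()
    }

# Toy movements (hypothetical), not taken from the study data.
toy_movements = [
    ("Greater London", "Greater Manchester"),
    ("Greater London", "Bristol"),
    ("Greater Manchester", "Greater London"),
    ("Greater London", "Greater Manchester"),
]
labels = classify_locations(toy_movements)
```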

Even after the establishment of local trans- 
mission in most English LTLAs, Greater London 
continued to be a source of mid-to-long range 
viral lineage movements (Fig. 3E). This is ex- 
pected given Greater London’s role as a major 
hub in England’s mobility network (similar 
trends were observed for the Alpha wave in 
2020) (26). The importance of Greater London 
as a source of short range (<50 km) lineage 
movements declined through time (Fig. 3E, 
top left) and we observe a secondary peak in 
the frequency of mid-to-long range movements 
(>50 km) driven predominantly by lineages 
emanating from the Midlands and southern 
England (Fig. 3E, middle and right). These ob- 
servations are consistent with epidemiological 
data showing that most areas outside of southern 
England experienced a BA.1 incidence peak 
only in the last week of December 2021 or the 
first week of January 2022 (fig. S13). 

To assess the contribution of demographic, 
epidemiological, and mobility-related factors 
to the dissemination of BA.1 in England, we 
used a phylogeographic generalized linear model 
(GLM) to test the association of those factors with 
viral lineage movements among LTLAs, during 
two distinct periods (before 26 December 2021, 
and between 26 December 2021 and 31 January 
2022; supplementary materials) (32, 33, 35). 
Using this time-inhomogeneous model we find 
evidence for a dynamic spatial transmission 
process, with the estimated effect size and 
relative importance of most predictors varying 
over time (Fig. 4B; rankings of predictors based 
on their deviance measure are shown in boxes). 
During the earlier “expansion” period of lineage 
dissemination, we observe strong support for 
the gravity model predictors (a spatial interaction 
model in which travel intensity between pairs of 
locations increases with origin and destination 
population sizes but decreases with distance). 
Consistent with results from continuous phylogeography 
(Fig. 3), this early period is characterized 
by directional viral dissemination; lineage 
movements tend to originate from Greater 
London (Fig. 4B) and this is particularly pro- 
nounced for smaller transmission lineages 
(Fig. 3 and fig. S12). For LTLAs with earlier 
times of peak incidence, we also find greater 
outflow of virus lineages during the expansion 
period (in three of four analyses) and a lower 
inflow of viral lineages during the post-expansion 
period (in four of four analyses; Fig. 4 and fig. 
S14). These results reflect the network-driven 
nature of Omicron’s geographic spread, with 
variation in the timing of peak incidence 
reflecting varying degrees of connection to lo- 
cations where frequent importation seeded 
early transmission chains (36). 
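The gravity model referred to above can be written in its generic textbook form, in which the expected flow between two locations scales with both population sizes and decays with distance. The sketch below uses that generic form only; the exponents, the constant k, and the input values are hypothetical and are not estimates from the phylogeographic GLM.

```python
def gravity_flow(pop_origin, pop_dest, distance_km,
                 k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Generic gravity model: k * P_i^alpha * P_j^beta / d_ij^gamma.

    All exponents and k are illustrative defaults, not fitted values.
    """
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return k * pop_origin**alpha * pop_dest**beta / distance_km**gamma

# Same pair of populations at two distances (hypothetical values).
near = gravity_flow(1_000_000, 500_000, 50.0)
far = gravity_flow(1_000_000, 500_000, 100.0)
```

With gamma = 2, doubling the distance cuts the predicted flow to a quarter. Taking logarithms turns the expression into a linear combination of log population sizes and log distance, which is why such terms fit naturally as predictors in a log-linear GLM.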

The human mobility predictor is supported 
consistently only in the post-expansion phase 
(Fig. 4B), after local transmission has been es- 
tablished in most LTLAs. This reflects a transi- 
tion from unidirectional long-distance movements 
to more homogeneous local dissemination. Con- 
versely, support for the gravity model predic- 
tors decreased over time (Fig. 4B), consistent 
with the notion that the gravity model better 
predicts city-to-city movement and poorly de- 
scribes diffusion-like mobility over short distances 
in urban areas (37). Importantly, the phylogeo- 
graphic GLM results are consistent among the 
transmission lineages analyzed (Fig. 4B), and 
when a simpler time-homogeneous model is 
used (fig. S15). These findings corroborate our 
continuous phylogeography analyses (Fig. 3) 
and epidemiological studies showing strong 
local spatial structure of the BA.1 wave (14, 15). 
We also explored whether booster vaccine uptake 
(per capita at the LTLA level) is supported as a 
predictor under a time-inhomogeneous model, 
but found no significant support (supplemen- 
tary materials), possibly a result of collinearity 
of this factor with other predictors or limited 
spatial heterogeneity in vaccine uptake. 


Discussion 


We find that most infections during the Omicron 
BA.1 wave in England can be traced back to a 
small number of introductions, which likely 
arrived before or during travel restrictions on 
incoming passengers from southern Africa. 
Although the rate of importation continued 
to increase after mid-December (Fig. 1C), the 
largest English transmission lineages tended 
to be those introduced earlier (Fig. 1D). These 
results augment previous investigations of VOCs 
in England and elsewhere (28, 38), highlighting 
that international travel restrictions can have 
limited impacts if applied after local exponen- 
tial growth is established and in the absence of 
local control measures. Our analyses indicate 
that epidemics of BA.1 in multiple locations 
outside the country where BA.1 was first detected 
contributed substantially to the growth of BA.1 
importation into England in December 2021 
(39). The impact of targeted travel restrictions 
may thus be constrained by the existence of 
multiple pathways between any two countries in 
the global aviation network, and such pathways 
often traverse highly connected locations with 
large travel volumes that can act as secondary 
sources of early importation (36). UK travel 
restrictions were intended to delay the expansion 
of BA.1 locally while offering additional vacci- 
nations to at-risk individuals. However, Omicron 
had likely already spread internationally by the 
time it was detected in late November 2021, 
allowing the establishment of secondary loca- 
tions of exportation (39, 40). Therefore, any 
proposed global systems that aim to rapidly 
detect and respond to new VOCs (and emerging 
infectious diseases in general) should be de- 
signed around the connection structure of 
human mobility networks. Despite this, there 
are likely to be scenarios under which travel 
restrictions can help control, contain, or delay 
the spread of emerging infections (41, 42); 
considerable additional theoretical and empirical 
work is needed to improve and inform rapid 
decision-making regarding travel during public 
health emergencies. 

Our two phylogeographic analyses (Figs. 3 
and 4) jointly show how Omicron BA.1 dissemi- 
nated rapidly across England, with Greater 
London central to its initial dissemination. 
Early viral movements outside of Greater London 
were dominated by medium-to-long-distance 
travel from there; local transmission in reci- 
pient locations was observed later, coinciding 
with an increase in human mobility after the 
winter holidays (fig. S18). The epidemic is re- 
vealed to be a network-driven phenomenon 
with an initial expansion phase that is well 
described by a gravity model, followed by a 
period of sustained local transmission propa- 
gated by short-distance movement (36). 

With this study, we can now compare the 
transmission histories of three VOC waves in 
England [Alpha (26), Delta (29), and Omicron] 
and contrast factors that influenced their dis- 
persals. First, Omicron and Delta were intro- 
duced through international importation, whereas 
Alpha appeared to have originated in England 
(43). For both Omicron and Delta, early intro- 
ductions from their presumed location of ori- 
gin were followed by growth in importation 
intensity from secondary locations. While early 
Delta transmission clusters were observed 
mainly in North West England, early Omicron 
infections were found mostly in Greater 
London (15, 18). Second, different NPIs and 
restrictions on within-country travel were im- 
plemented during the VOC waves. Although 
Delta arrived when NPIs in England were 
being relaxed, its initial spread was delayed 
because of lower mobility levels following a 
national lockdown (29). By contrast, Omicron 
was introduced when mobility had largely 
recovered to prepandemic levels (fig. S18). 
Alpha was observed to rapidly expand from 
its proposed origin in southeast England— 
likely attributable in part to holiday travels 
(26)—and was subsequently brought under 
control when local mobility decreased after 
the introduction of NPIs (26). Third, the dissem- 
ination of each VOC is likely to be differen- 
tially affected by spatial variation in population 
immunity. Such variation was likely limited 
during Delta’s emergence as a result of high 
population levels of vaccination and previous 
infection and also during Omicron’s emergence 
due to the antigenic novelty of BA.1 (9, 44, 45). 
By contrast, initial growth rates of Alpha in 
England were found to be affected by local 
variation in previous attack rates (26). These 
findings highlight two key questions for future 
work: how do spatiotemporal interactions be- 
tween importation and local transmission 
shape the spread of a VOC, and how can we 
efficiently evaluate the interplay of factors that 
drive the dissemination of new VOCs within a 
country? 

We interpret our phylodynamic results in 
the context of several limitations. First, as dis- 
cussed previously (28), the inferred number of 
importation events underestimates the true 
number of independent introductions due to 
incomplete sampling and uneven sequencing 
coverage worldwide (46). Nevertheless, we were 
able to cross-validate our phylodynamic results 
using independent epidemiological data (figs. 
S7 and S8). Second, to maintain computational 
tractability and remove potential sampling bias, 
we subsampled all available English Omicron 
genomes, accounting for geographical variations 
in sequencing coverage and prevalence. How- 
ever, even after this subsampling, the spatial 
and temporal sampling was not perfectly representative 
(Fig. 4A and fig. S9). This could 
be due to spatial variation in case reporting 
rate or because the maximum sequencing capa- 
city was exceeded in locations with high inci- 
dence. Third, our phylogeographic GLM analysis, 
which explores the association of factors with 
virus lineage movement, should be interpreted 
in light of potential biases in the mobility data. 
For example, mobility in sparsely populated 
locations may be poorly captured as a result of 
censoring to protect user anonymity, and the 
degree to which smartphone data are repre- 
sentative of the whole population is affected 
by variation in smartphone use among locations. 
Work is ongoing to assess how human mobility 
data can be best applied to the prediction and 
description of infectious disease invasion dyna- 
mics (47, 48). 

Omicron BA.1 was replaced by lineage BA.2 
in February 2022 and later by lineage BA.5 in 
June 2022 (18, 19). Although the public health 
emergency of international concern has ended 
(49) and the public health burden of COVID-19 
has lessened as a result of reduced average 
disease severity and increased population immunity, 
the continued antigenic evolution of SARS-CoV-2 
means that future VOCs of unknown 
virulence remain possible. One priority in preparing 
for the next VOC or novel pathogen emergence 
is to develop and implement robust 
pipelines for large-scale genomic and epidemiological 
analyses supported by unified data infrastructures 
(50, 51), a challenging task that 
will be realized only through close coordination 
of public health efforts worldwide. 

Emergent coexistence in multispecies microbial communities 

Understanding the mechanisms that maintain microbial biodiversity is a critical aspiration in ecology. Past 
work on microbial coexistence has largely focused on species pairs, but it is unclear whether pairwise 
coexistence in isolation is required for coexistence in a multispecies community. To address this question, 
we conducted hundreds of pairwise competition experiments among the stably coexisting members of 
12 different enrichment communities in vitro. To determine the outcomes of these experiments, we 
developed an automated image analysis pipeline to quantify species abundances. We found that 
competitive exclusion was the most common outcome, and it was strongly hierarchical and transitive. 
Because many species that coexist within a stable multispecies community fail to coexist in pairwise 
co-culture under identical conditions, we concluded that multispecies coexistence is an emergent 
phenomenon. This work highlights the importance of community context for understanding the origins of 
coexistence in complex ecosystems. 


Explaining species coexistence and the bewildering 
diversity of ecological communities 
is a major goal of ecology (1). 
Historically, this problem has been in- 
vestigated through the lens of species 
interactions and population dynamics. This 
work has played a central role in theoretical 
ecology (2, 3), establishing, for example, the 
importance of competitive interactions for 
community stability (4, 5) and the criteria re- 
quired for stable coexistence in species pairs 
and pairwise networks (6, 7). An important 
caveat is that the ability of any model to fully 
capture the population dynamics of empirical 
populations is limited, and interactions be- 
tween species are often modulated by envi- 
ronmental context (8, 9) and by the presence 
of additional species (10, 11). As a result, in 
recent years, research has started to shift to 
directly study coexistence networks (12-14). 
A central question is whether the known out- 
come of competition between pairs of species, 
i.e., their coexistence or competitive exclusion, 
can be leveraged to predict the composition of 
complex communities and the paths leading 
to their assembly (12, 13, 15). If this approach 
were fruitful, then it would circumvent the 
need to know the full mathematical structure 
of population dynamics models to predict community 
assembly (14). 
There are two opposing views on species 
coexistence (Fig. 1A). A reductionist perspective 
is that multispecies coexistence is an additive 
affair, and all of the coexisting members 
of a community must also coexist as pairs 
when isolated from the community context 
(14). An alternative view is that coexistence 
in a multispecies community is a more com- 
plex, or emergent, property of the community, 
which is not exhibited by its most elementary 
units of coexistence, pairs of species in isola- 
tion (16). Which of these two views best reflects 
the reality of empirical communities (Fig. 1A)? 
Determining which view is more accurate re- 
quires deconstructing a community into spe- 
cies pairs to determine whether all possible 
combinations can coexist. If most can, then 
the reductionist view is supported. If few can, 
then coexistence is an emergent property of 
the community, as is seen in nontransitive 
competition [i.e., as in the rock, paper, scissors 
game (17-24)], which may allow multiple spe- 
cies to coexist even when none of them do as a 
pair in isolation (17-24). 

Resolving this question is essential in mi- 
crobial ecology given the enormous and still 
largely unexplained diversity of microbial eco- 
systems (13, 25). Directly testing the two 
hypothesized scenarios described above is 
generally not feasible in natural microbial 
communities because of their diversity. Even if 
we managed to isolate most community mem- 
bers from a given habitat, the number of rep- 
licate environments that we would need to 
recreate to culture every single pair would 
scale quadratically with the community rich- 
ness. Recent studies have taken a synthetic 
approach by reconstituting species pairs from 
natural communities in well-controlled lab- 
oratory environments (14, 26, 27). Although 
these studies found support for the reduc- 
tionist hypothesis, their limitation lies in the 
small fraction of coexisting species that could 
be isolated and the differences between the 
laboratory environment and the original com- 
munity habitat. 
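To make the quadratic scaling concrete, the number of distinct pairs among N taxa is N(N - 1)/2. The sketch below is illustrative only: the function name `n_pairs` is ours, and the richness values are the community sizes (5 and 13 ESVs) and inoculum sizes (110 and 1290 ESVs) reported elsewhere in this article.

```python
from math import comb


def n_pairs(richness: int) -> int:
    """Number of distinct species pairs among `richness` taxa, N(N-1)/2."""
    return comb(richness, 2)


# Richness values quoted in this article; each pair would still need
# replicate cultures and several starting ratios on top of this count.
for n in (5, 13, 110, 1290):
    print(f"N = {n:>4}: {n_pairs(n):>7} distinct pairs")
```

Under this count, a 13-member enrichment community requires 78 co-cultures per starting ratio, whereas a 1290-member inoculum would require more than 800,000.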


Here, we test these two
hypotheses in an empirical system that is 
suited for this purpose. Our starting point was 
a collection of bacterial enrichment commu- 
nities that we have recently assembled in well- 
controlled synthetic environments containing 
glucose as the single externally supplied limit- 
ing nutrient (Fig. 1B) (9, 28-31). These com- 
munities formed in a manner that is similar to 
the “random zoo” model in theoretical ecology 
(32). In brief, 12 soil and plant microbiomes 
were resuspended in separate test tubes con- 
taining M9 minimal medium (9) (Fig. 1B). This 
provided us with a diverse pool of bacterial 
species containing between 110 and 1290 exact 
sequence variants (ESVs) (fig. S1) (9). These 
12 initial microbiota solutions were then in- 
oculated by a 125-fold dilution into separate 
bioreactors containing M9-glucose growth 
medium (see the materials and methods), in- 
cubated for 48 hours under static conditions 
at 30°C, and then serially passaged 12 times 
each (~84 bacterial generations under our 
conditions) (Fig. 1B) (9). Community composi- 
tion at various time points was determined by 
16S ribosomal RNA (rRNA) amplicon sequencing.
All communities contained multiple (N <
25) coexisting ESVs belonging primarily to the
families Enterobacteriaceae and Pseudomona- 
daceae (Fig. 1B) (9, 28, 30, 33). It was thus pos- 
sible in this system to deconstruct multiple 
stable communities and reconstitute and com- 
pete most pairs of species under the same 
starting conditions. This experimental system 
allowed us to evaluate whether all pairs of 
organisms that coexist as a part of a multispe- 
cies community also coexist in isolation (29), 
thus directly testing whether coexistence is a 
pairwise or an emergent phenomenon. 
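The figure of roughly 84 generations follows from the dilution factor, assuming that each culture regrows by the full dilution factor during every growth cycle. A minimal sketch of that arithmetic (the function name `generations` is ours):

```python
from math import log2


def generations(dilution_factor: float, n_passages: int) -> float:
    """Doublings needed to regrow by `dilution_factor`, summed over passages.

    Assumes the population recovers the full dilution factor in each cycle.
    """
    return n_passages * log2(dilution_factor)


# 12 passages at 125-fold dilution (community assembly)
print(round(generations(125, 12), 1))  # about 83.6
# 8 passages at 125-fold dilution (pairwise competition assays)
print(round(generations(125, 8), 1))
```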


RESULTS 
Coexistence is stable in our enrichment communities


To establish whether coexistence in these mul- 
tispecies enrichment communities is stable, 
we set out to analyze the published commu- 
nity assembly dynamics data from previous 
studies (9, 34, 35) in which the frequencies 
(x_T) of all ESVs were quantified at the end of
each transfer (T) for 26 representative communities
(fig. S2). We determined the invasion
fitness [F = log(x_T/x_{T-1})] of every ESV in
these communities over their full assembly
dynamics (T = 2, 3, ..., 12; see the materials
and methods) and found that a large majority
of the ESVs found at the end of the experiment
exhibited hallmarks of negative frequency-dependent
selection. For 95/99 of these ESVs,
the dependence between fitness and frequency
was best fit by a negative regression slope
(fig. S3), and the equilibrium frequency (x*)
predicted from this linear regression model
[the frequency for which F(x*) = 0] agreed very
well with the empirically observed equilibrium
frequencies, which we determined as the average
frequency of the ESV over the last four
transfers (Fig. 1C, fig. S3, and materials and
methods). By contrast, ESVs that were only transiently
present during community assembly but
were not part of the final stable community
generally exhibited either negative average
fitness values or equilibrium frequencies close
to 0 (figs. S4 and S5). Overall, our quantitative
analyses indicated that the ESVs that were present
in the final transfer of our multispecies
enrichment communities could invade from
low frequency, fulfilling the mutual invasibility
criterion of stable coexistence (36).

Fig. 1. Enrichment microbial communities allowed us to test the complexity
of species coexistence. (A) The two hypotheses about species coexistence
tested in our study. (B) To discriminate between the two hypotheses, we used an
empirical system constructed from previously assembled enrichment in vitro
bacterial communities under serial growth and dilution cycles (9). In inset I,
we present the full assembly dynamics for a representative community,
showing the frequency of each ESV at the end of every growth period
(transfers). We only show ESVs >2% in frequency, each in a different color.
We chose 12 representative communities with richness ranging between
N = 5 and N = 13 ESVs at transfer 12 (inset II) and isolated most community
members (colored bars) covering an average of 89.4% of the abundance.
Gray bars represent ESVs that we were not able to isolate (see the materials
and methods). Raw data were obtained from previous studies (9, 34, 35).
(C) Frequency-dependent dynamics predicted the empirically observed
equilibrium frequencies. Empirical equilibrium frequencies (horizontal axis)
were quantified as the average frequency of an ESV in the last four transfers
of the community assembly process (transfers nine to 12). To determine the
predicted equilibrium frequency x* (x axis), we first quantified the invasion
fitness F_T = log(x_T/x_{T-1}) for each ESV at each transfer and then regressed this
F_T against ESV frequency. This regression yielded a negative slope for 95/99
ESVs found near the equilibrium in their respective community (fig. S3),
indicating that these ESVs are subject to negative frequency-dependent
selection. In these cases, we estimated the equilibrium frequency x* as
the x-intercept of the regression line (figs. S3 and S4). (D) Two examples of
invasion fitness analysis from the community in inset I showing negative
frequency-dependent selection. The yellow line represents the linear fit
as determined by least-squares regression (N = 11, R² = 0.92 and N = 11, R² =
0.70 for the top and bottom panels, respectively). The x-intercept was used
to estimate the equilibrium frequency x*, which is shown as a vertical
dashed line.

Quantification of pairwise competition assays

To empirically test whether stable multispecies
coexistence was a pairwise phenomenon
in our enrichment communities, we chose 12
representative communities containing between
five and 13 ESVs in stable equilibrium, plated
them on their final transfer, and then selected
at least three morphologically distinct isolates 
from each community (fig. S6 and materials 
and methods). Using Sanger sequencing, we 
obtained the full-length sequence of the 16S 
rRNA gene of these isolates, aligned it with the 
ESVs that were found in their communities of 
origin, and retained all isolates with at least 
200-base pair consensus sequence and four 
or fewer mismatches. This resulted in a total 
of 62 isolates, 40 with fully matching align- 
ments and 22 with one to four mismatches 
(fig. S7 and materials and methods), covering
on average 89.4% of the ESV composition of
the original communities (Fig. 1B).

Fig. 2. Multispecies coexistence is an emergent property of the community.
(A) To determine whether isolated species pairs coexist or outcompete one
another, we cultured each pair at three different initial frequencies. Pairs were
propagated in the same culture conditions as their community of origin for
eight consecutive passages. The pairwise competition outcomes of all 12
enrichment communities are shown in (B), and communities are ordered
by the number of strains in each community from the smallest (three taxa) to
the largest (10 taxa). The numbers above each bar show the number of ESVs,
the number of isolated strains, and the number of tested pairs, respectively.
Note that some communities have missing pairs because these pairs either
did not have any colonies in co-culture or had low classification model accuracy.
(C) Competition outcomes of 144 pairwise co-cultures. Mean frequencies and
95% confidence intervals were determined by Poisson sampling (N = 1000;
see the materials and methods). For clarity, we plotted in all cases the frequency
of the isolate ending with a lower average frequency in time point 8 (T8). In
coexisting pairs, the mean equilibrium frequency on the final transfer is
represented by a horizontal dashed line, and the 95% confidence interval
(computed from Poisson sampling, N = 1000) as a shaded area around it. Each
of the inset grids indicates the change in frequency from the initial time point
(T0) to the final one (T8). The background color represents the competition
outcomes, and the line color indicates the three initial frequencies. To establish
significant changes in frequency between T0 and T8 in each experiment, we
used Wilcoxon-Mann-Whitney tests with N = 2000 and a significance threshold
of P < 0.05 (see the materials and methods).

We then performed every possible pairwise 
competition experiment among the isolates of 
each community by mixing inocula of pairs of 
isolates and passaging each mixture for eight 
growth-dilution cycles in the same glucose 
minimal medium at the same temperature 
(30°C) used in the original community enrich- 
ment experiments (Fig. 2A). All pairwise com- 
petition experiments were performed three 
times, each at a different starting count pro- 
portion of ~5:95, ~50:50, and ~95:5 (Fig. 2A 
and materials and methods). During each
growth cycle, the cells were incubated for 
48 hours, after which the resulting culture 
was diluted 125-fold into fresh medium, as 
was done in the original community assembly 
experiment (9). At the end of the last dilution 
cycle, we measured the composition of our 
pairwise co-cultures by plating them on Petri 
dishes and counting the colonies belonging to 
each isolate. 
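The Fig. 2 legend states that mean frequencies and 95% confidence intervals were obtained by Poisson sampling (N = 1000). The exact procedure is given in the materials and methods, which are not reproduced here, so the following is only one plausible reading: each colony count is treated as the mean of a Poisson distribution and resampled. The function names, the seed, and the example counts are hypothetical.

```python
import math
import random


def poisson_draw(lam: float, rng: random.Random) -> int:
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for plate-scale counts; exp(-lam) underflows for lam above ~700.
    """
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def frequency_ci(count_a: int, count_b: int, n_draws: int = 1000, seed: int = 0):
    """Mean and 95% interval of isolate A's frequency under Poisson resampling."""
    rng = random.Random(seed)
    freqs = []
    for _ in range(n_draws):
        a = poisson_draw(count_a, rng)
        b = poisson_draw(count_b, rng)
        if a + b > 0:  # skip draws with no colonies at all
            freqs.append(a / (a + b))
    if not freqs:
        raise ValueError("no colonies in any resampled plate")
    freqs.sort()
    mean = sum(freqs) / len(freqs)
    lower = freqs[int(0.025 * (len(freqs) - 1))]
    upper = freqs[int(0.975 * (len(freqs) - 1))]
    return mean, lower, upper


# Hypothetical plate: 30 colonies of isolate A and 70 of isolate B.
print(frequency_ci(30, 70))
```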

To avoid human bias in colony morphology 
identification, we adopted an automated image- 
processing pipeline (fig. S8) combined with a 
machine-learning approach for classification
using 159 × 3 = 477 co-culture images on the
basis of 40 colony morphology features (figs. 
S9 and S10, table S1, and supplementary mate- 
rials). The pipeline started by extracting color 
channels and correcting for uneven backgrounds, 
followed by segmenting colony objects and ex- 
tracting the morphological features from these. 
These colony features were analyzed using ran- 
dom forest classification to determine whether 
each colony present in the co-culture image 
belonged to one morphotype or another (fig. 
S10). This approach allowed us to quantify the
number of colony-forming units of each of 
the two competitors in pairwise co-culture.

Fig. 3. Competitive hierarchy prevails among species pairs in stably coexisting
communities. (A) All isolates in our 12 communities were rank-ordered from top
to bottom on the basis of the number of other isolates that they excluded in pairwise
competition, using data from the experiments shown in Fig. 2. The gray nodes in
the network represent each individual isolate. Red arrows point from the winning

Of the 159 competing pairs, six did not yield a
measurable optical density in any of the
three competition assays regardless of their
inoculation frequencies, and no colonies were
detected. Because we could assign neither co- 
existence nor competitive exclusion to these 
pairs, which were also formed by pairs of 
isolates that were not present at the final 
transfer in monoculture, we excluded them 
from further analysis. We removed nine ad- 
ditional pairs for which the trained model 
performed poorly on the validation datasets 
(accuracy score <0.9; fig. S11 and materials 
and methods). We therefore used N = 144 
pairs in our analysis. The automated pipe- 
line approach agreed well with visual colony 
identification, yielding comparable results for 
both the total colony count on a plate [R² =
0.85; root-mean-square deviation (RMSD) =
17.67; N = 381] and the relative frequency
of different colony morphotypes (R² = 0.87;
RMSD = 0.17; N = 381) (fig. S12) for the 127
pairs with an accuracy score > 0.9 that could 
be discriminated by eye. 
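For readers who wish to run the same kind of agreement check between automated and manual counts, a minimal sketch follows. The text does not state how R² was computed; here it is assumed to be the squared Pearson correlation. The function names and the example counts are hypothetical.

```python
from math import sqrt


def rmsd(auto, manual):
    """Root-mean-square deviation between paired measurements."""
    return sqrt(sum((a - m) ** 2 for a, m in zip(auto, manual)) / len(auto))


def r_squared(auto, manual):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    n = len(auto)
    mean_a = sum(auto) / n
    mean_m = sum(manual) / n
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(auto, manual))
    var_a = sum((a - mean_a) ** 2 for a in auto)
    var_m = sum((m - mean_m) ** 2 for m in manual)
    return cov ** 2 / (var_a * var_m)


# Hypothetical total colony counts on four plates.
automated = [52, 110, 35, 78]
by_eye = [50, 120, 30, 80]
print(r_squared(automated, by_eye), rmsd(automated, by_eye))
```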


Multispecies coexistence is an emergent property

In 26.4% of the pairs (38/144), one of the two 
competitors had become competitively excluded 
by the end of the last dilution cycle in all three 
competition experiments (i.e., no colonies 
were detected on the plates) regardless of its
starting inoculation proportion (Fig. 2, B and 
C, dark red). We marked these outcomes as 
competitive exclusion. For 45.1% of the pairs 
(65/144), the frequency of the losing species 
declined (ΔF < 0) in all three competition
experiments regardless of its initial propor- 
tion (Fig. 2, B and C, light red box, and 
materials and methods). This indicates that 
its trajectory was on the path to competitive 
exclusion. Adding these outcomes to the com- 
petitive exclusion category, we found that 
71.5% of the pairs (103/144) failed to coexist
in the absence of the other community mem- 
bers. These results were not driven by the poor 
competitive ability of the least-abundant ESVs, 
because eliminating from the analysis those 
isolates with ESVs with <0.05 frequency in the 
stable multispecies communities still produced 
a majority of competitive exclusion outcomes 
(61/84 = 72.6%) (fig. S13). 

All 12 communities contained at least one 
pair, but generally more, that could not coexist 
in isolation (Fig. 2B). The fraction of pairs ex- 
hibiting competitive exclusion was similar 
across communities regardless of their richness
(Fig. 2B). These results are not consistent
with the additive assembly rule proposed previously
(14), which would have predicted less-diverse
communities composed only of those
taxa that can coexist in isolated pairs. Therefore,
complex multispecies coexistence could
not be reduced to pairwise relationships in our
communities, and it is thus likely an emergent
property of the whole community.

[Fig. 3 legend, continued] … competition to the losing one. Blue lines connect isolates that
… distribution with N = 77, P = 0.25 showing the expected number
… if we randomly swapped the coexistence and exclusion
… circle marks the experimentally determined number of
… (in our case, zero).

A substantial fraction of pairwise competitions
(28.5% of the pairs, 41/144) did not result
in competitive exclusion, indicating that pairwise
coexistence may still be common among
members of a stable multispecies community.
Among these 41 pairs, 29 were still coexisting
in all three competition experiments after eight
transfers (Fig. 2, B and C, blue). To identify
those pairs that coexist stably, we applied the
mutual invasibility criterion, which requires
that both species be able to invade each
other from low frequency (36) (Fig. 2, B and C,
dark blue). Methodologically, this requires that
sign(Δx) = sign[x* - x(T0)] for both species in
all three pairwise competition experiments
(see the materials and methods). Here, Δx denotes
the change in a species’ frequency between
the final and initial transfers, x* is the
equilibrium frequency for that species (which
we determined by averaging the final transfer
frequencies of the three experiments; see the
materials and methods), and x(T0) is the species’
inoculation frequency on the first day of
the experiment. This condition was met in 21
of the 29 coexisting pairs. The criteria for mutual
invasibility were not met in the remaining
eight coexisting pairs, so we classified these as
coexisting without evidence of mutual invasibility
(Fig. 2, B and C, light blue). The remaining
fraction of our pairs (12/144, 8.3%) did not
offer conclusive results because the outcome 
of the competition was not consistent in the 
three experiments. We left these as inconclu- 
sive (Fig. 2, B and C, gray). 
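The classification scheme described in this section can be summarized in a short sketch. This is a simplification rather than the procedure used in the study: it ignores the statistical testing of frequency changes and works directly on the frequencies of one isolate (A) in the three co-cultures. Because the two frequencies in a pair sum to 1, checking the sign condition for A is equivalent to checking it for B. The function names and example values are hypothetical.

```python
def sign(value: float, tol: float = 1e-9) -> int:
    """Return -1, 0, or +1, treating |value| <= tol as zero."""
    return (value > tol) - (value < -tol)


def classify_pair(initial, final):
    """Classify a pair from isolate A's frequencies at T0 and at the last transfer.

    `initial` and `final` hold one value per starting ratio (three in the text).
    """
    # One isolate absent from every final plate: competitive exclusion.
    if all(f <= 0.0 for f in final) or all(f >= 1.0 for f in final):
        return "competitive exclusion"
    deltas = [f - i for i, f in zip(initial, final)]
    # Same isolate declining from every starting ratio: on the path to exclusion.
    if all(d < 0 for d in deltas) or all(d > 0 for d in deltas):
        return "on the path to exclusion"
    # Both isolates present in all three co-cultures: some form of coexistence.
    if all(0.0 < f < 1.0 for f in final):
        x_star = sum(final) / len(final)  # equilibrium estimate, mean of finals
        if all(sign(d) == sign(x_star - i) for i, d in zip(initial, deltas)):
            return "stable coexistence"
        return "coexistence without evidence of mutual invasibility"
    return "inconclusive"


start = [0.05, 0.50, 0.95]
print(classify_pair(start, [0.30, 0.32, 0.34]))
```

In the example, isolate A rises from 0.05 and falls from 0.50 and 0.95 toward a common frequency near 0.32, so the sign condition holds in all three co-cultures.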


Competitive exclusion is hierarchical and transitive 

In an effort to better explore the structure of 
our pairwise competition network, we used 
the competition outcomes shown in Fig. 2 to 
rank all isolates in each community by the 
number of competitors that each of them ex- 
cluded (see the materials and methods). We 
found that competitive exclusion was almost 
fully hierarchical: In all but one of the 103 pairs 
in which one of the two isolates was excluded, 
the lower-rank species was the one that was 
excluded (Fig. 3A). The ranks in the competi- 
tive hierarchy were positively correlated with 
the frequency rank of the corresponding ESV 
in the parent community (Spearman’s ρ = 0.42,
P < 0.001, N = 62; fig. S14), but this pattern was
mostly driven by less-diverse communities (fig. 
S14), which recapitulates previous findings from 
plant communities (37). We also found that 
competitive hierarchy was positively correlated 
with the strain’s growth rate in glucose medium 
(Pearson’s r = -0.314, P = 0.0129, N = 62; fig. S15; the coefficient is negative because rank 1 denotes the top competitor).
Regarding the type of metabolism, respiro- 
fermenters had a higher average competitive 
rank (mean = 2.54) than obligate respirers 
(mean = 4.72) (Wilcoxon-Mann-Whitney test 
P < 0.001, N = 62; fig. S16), a pattern consistent
with our previous work (9, 28). 
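The ranking and the hierarchy check can be illustrated as follows. The sketch ranks isolates by the number of competitors they exclude and lists exclusion outcomes in which the lower-ranked isolate wins. How ties were handled in the study is not stated here; giving tied isolates the same rank is our assumption, and the isolate labels and outcomes are hypothetical.

```python
from collections import Counter


def competitive_ranks(isolates, exclusions):
    """Rank isolates by number of competitors excluded (rank 1 = most wins).

    `exclusions` is a list of (winner, loser) tuples. Ties share a rank.
    """
    wins = Counter(winner for winner, _ in exclusions)
    ordered = sorted(isolates, key=lambda s: -wins[s])
    ranks, rank, previous = {}, 0, None
    for isolate in ordered:
        if wins[isolate] != previous:
            rank += 1
            previous = wins[isolate]
        ranks[isolate] = rank
    return ranks


def hierarchy_violations(ranks, exclusions):
    """Exclusion outcomes in which the lower-ranked isolate was the winner."""
    return [(w, l) for w, l in exclusions if ranks[w] > ranks[l]]


# Hypothetical community: A excludes B, C, and D, but is itself excluded by E.
isolates = ["A", "B", "C", "D", "E"]
outcomes = [("A", "B"), ("A", "C"), ("A", "D"), ("E", "A")]
ranks = competitive_ranks(isolates, outcomes)
print(ranks, hierarchy_violations(ranks, outcomes))
```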

An extreme case of emergent coexistence 
may occur when coexistence networks are non- 
transitive (17, 24). However, we found that 
nontransitive cycles were unlikely to stabilize 
coexistence in our communities. Of 77 triplets 
of species that could be connected by compet- 
itive exclusion links in our 12 communities, we 
did not find a single violation of transitivity 
(Fig. 3B). Because the expected fraction of 
nontransitive triplets in a random network is 
P = 1/4, the probability of observing this outcome
by chance is given by P(0) = (1/4)^77 =
4.4 × 10^-47 (Fig. 3B).
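The 1/4 figure can be checked by enumerating the eight ways of orienting the three exclusion links within a triplet: only the two cyclic orientations are nontransitive. The sketch below does this and then evaluates the expression (1/4)^77 as printed in the text. The helper name `is_cycle` is ours.

```python
from itertools import product


def is_cycle(a_beats_b: bool, b_beats_c: bool, c_beats_a: bool) -> bool:
    """True when the three links form a rock-paper-scissors loop.

    The loop runs A > B > C > A when all flags are True and in the
    opposite direction when all are False.
    """
    return a_beats_b == b_beats_c == c_beats_a


orientations = list(product([True, False], repeat=3))
cyclic = [o for o in orientations if is_cycle(*o)]
fraction = len(cyclic) / len(orientations)

print(fraction)              # 0.25
print(f"{0.25 ** 77:.1e}")   # 4.4e-47
```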


Discussion 


The aim of this study was to empirically test 
whether coexistence in microbial communi- 
ties is a pairwise phenomenon or if it is an 
emergent property of the community. To ad- 
dress this question, we isolated most mem- 
bers of 12 stable enrichment communities 
and determined whether each possible pair 
could coexist in the absence of the other mem- 
bers of their communities under the same cul- 
ture conditions as in the enrichment. Although 
a substantial fraction of pairs did coexist 
(29/144, 20.1%), a majority (103/144, 71.5%) 
of them ended up in competitive exclusion, 
with one of the two members becoming excluded
or on the path to it. This indicates that
coexistence could not be reduced to a pairwise 
phenomenon in our enrichment communities 
and that the community context is generally 
required for species pairs to coexist. Our find- 
ing contrasts with the outcome of a recent 
empirical study supporting the reductionist 
hypothesis, which concluded that the coex- 
istence of multiple species in bottom-up as- 
sembled communities requires every pair to 
coexist in isolation (14). 

Given that both hypotheses can be correct 
in different communities (14, 16), our results
prompt the question of under which condi- 
tions each is most likely to occur. We have not 
yet determined whether the complex nature 
of multispecies coexistence in our enrichment 
communities derives from higher-order inter- 
actions, or if it can be explained by a complex 
network of pairwise interactions. Another pos- 
sible factor that may stabilize coexistence, but 
which our study has not addressed, is the rapid 
emergence of intra-strain diversity through evo- 
lutionary processes. Evolution of new species 
interactions, such as the appearance of a new 
mutualism, may mediate the emergent coex- 
istence of pairs of strains that would other- 
wise end in competitive exclusion (16). As for 
broader evolutionary patterns, we did not find 
a correlation between pairwise coexistence 
and sequence similarity (fig. S17), although our 
analysis was limited to the 16S marker gene. 
Finally, spatial structure is also known to af- 
fect microbial coexistence [e.g., (38)], but the 
number and nature of spatial niches could not 
be identified in a straightforward manner in 
our experiments. 

Theoretical studies have suggested that 
nontransitivity can stabilize the coexistence 
of multiple competing species in the presence 
of spatial heterogeneity (8) or when com- 
petitors have differential competitive abil- 
ities on multiple limiting resources (17, 2). 
Although the idea of nontransitivity is well 
established in theory, empirical studies on 
its prevalence are sparse. Our mutual invasion
experiments with 144 species pairs drawn from
the 12 communities did not find a
single nontransitive trio, suggesting strongly 
hierarchical competition among our species. 
This discrepancy between theory and our 
findings may be caused by the underlying eco- 
logical interactions among competing spe- 
cies. In our communities, exploitation of the 
single externally supplied limiting nutrient 
and cross-feeding appeared to be the domi- 
nant ecological interactions determining the 
community structure (9, 28), whereas non- 
transitivity may emerge through interference 
competition (39) or through changes in spe- 
cies’ competitiveness across resources (40). 

Our experiments suggest that pairwise coexistence
is not necessarily required for the
stable assembly of multispecies communities.
However, more complex assembly rules might 
still be found to predict and explain multispe- 
cies coexistence. Future empirical work with 
communities assembled under growing envi- 
ronmental complexity will be necessary to es- 
tablish how factors such as spatial structure, 
the number of supplied resources, the existence 
of higher-order interactions, and fluctuating 
conditions may influence the complexity of 
coexistence in multispecies communities. 
Finding my postdoc identity


“Please do not leave the lab before I graduate,” the Ph.D. student said. He needn’t have worried;
I have no plans to leave my postdoc position just yet. He said it half-jokingly over a lab lunch 
as we were talking about another lab member who had recently moved on and others who may 
be going soon, but I think he was sincere. I was touched, as I have done my best to support and 
guide the graduate students in the lab. His words also helped put to rest old worries about what
kind of postdoc I would become.


When I was a graduate student, go- 
ing on to a postdoc seemed the ob- 
vious next step. I loved conducting 
research and wanted to get more 
experience, ask big questions, and 
make important discoveries. But 
I didn’t really know what the job 
entailed. I knew a handful of post- 
docs and had the perception that 
they were extremely smart and 
productive. But there were none 
in my lab and very few in my de- 
partment. With so few examples 
to follow, I worried about how I 
would know what is expected of a 
postdoc—and how I would live up 
to those expectations. 

Before I graduated, I sought 
advice from faculty and mentors 
about how to establish myself as 
a postdoc. One told me to follow 
my passion and curiosity, another 
said not to forget to have fun, a 
third told me to be myself and I would be fine. This advice 
helped reassure me that I would find my way—though I 
still didn’t know exactly what that might look like. 

During my postdoc interview, my future adviser and I 
discussed his expectations for me, including finishing my 
main project before leaving. We also talked about mentor- 
ing, funding, and professional development. I was begin- 
ning to have a more solid vision of what being a postdoc 
would mean. And when I joined the lab, I figured I could 
learn more by observing the senior postdocs. But then 
came the pandemic, and we were all working on our own. 
I realized I would need to figure out my postdoc personal- 
ity on my own. 

In graduate school, I had always enjoyed helping my 
colleagues with their work—talking through research 
challenges, brainstorming ideas for new approaches, as- 
sisting with data analysis. I hesitated to continue to play 
that role as a postdoc, as I was navigating a new lab and 
nurturing relationships with new colleagues while also 


“| realized | would 
need to figure out my postdoc 
personality on my own.” 


trying to prove myself and make 
progress on my own research. 

But gradually my old impulses 
kicked in. When lab mates talked 
about their work, I couldn’t resist 
trying to help them. And I was 
gratified to find that my help was 
welcomed, and even useful. Maybe 
I did have something to offer. 
Maybe my identity as a postdoc 
was much the same as when I was 
a grad student: a helper, this time 
with a little more experience. 

To be sure, I sometimes struggle 
to advance my own work when 
others are coming to me multiple 
times a day with ideas or ques- 
tions. My adviser noticed this, too. 
He has commended me for help- 
ing the other members of the lab 
but noted that I need to prioritize 
my own work and protect my time 
as well. 

His advice might have spurred me to focus only on my 
research. I know others who take this approach; after all, 
the postdoc is a make-or-break stage of an academic re- 
search career, when we need to amass publications to have 
a chance at faculty jobs. 

But I knew I couldn’t deny my nature, nor did I want to. 
I had found my postdoc identity, and I was going to keep 
it—maybe with a little adjusting. These days, when I need 
to focus on something I place a note on my desk saying 
I am taking a “vow of silence,” during which I can’t talk 
to others. That has helped establish a balance that works 
for me. 

I’ve been a postdoc for about 3 years now. I am con- 
tent with my research progress and happy with how much 
I have learned. Just as important, I’m grateful to have the 
opportunity to help others. It is a privilege I embrace. 


Moamen Elmassry is a postdoctoral research fellow at Princeton 
University. Send your career story to SciCareerEditor@aaas.org. 